WILLIAMS COHERENCE AND BEYOND
Abstract. In this paper we discuss the consistency concept of Williams coherence
for imprecise conditional previsions, presenting a variant of this notion,
which we call W-coherence. It is shown that W-coherence ensures important
consistency properties and is quite general and well-grounded. This is done
by comparing it with alternative, or at any rate similar, consistency definitions,
both well known and less known. The common root of these concepts is that they variously
extend to imprecision the subjective probability approach championed by de
Finetti. The analysis in the paper also helps to clarify several
little investigated aspects of these notions.
Keywords. Conditional lower previsions, Williams coherence, envelope theorem,
centered convex previsions, conglomerability.
1. Introduction

Quite recently, P.M. Williams’ 1975 seminal paper Notes on conditional previsions
was published in a slightly revised version [29], preceded by an introductory
paper discussing basic aspects and historical motivations for his work [25]. This
fact confirms that Williams’ ideas on coherence still play a very important role
in the theory of conditional imprecise previsions. In the past, they influenced the
more widespread theory developed by Walley [26]. Williams coherence was also
directly used in some papers to achieve results in different areas, including epistemic
independence [24], problems of checking consistency for conditional imprecise
probabilities m, consistency for unbounded random variables [23]. Yet, certainly also
because of its overall limited diffusion in the scientific community, several aspects
of Williams coherence are still little explored.

A basic motivation for studying Williams coherence is its generality: in the
version we present in the paper, it extends to a very broad conditional setting
Walley’s (unconditional) coherence, which already encompasses as special cases
several uncertainty theories (2-monotone probabilities, precise probabilities, belief
functions, possibility/necessity measures, coherent risk measures, ...) applied in
many different areas, from artificial intelligence to statistics or risk measurement.
Thus extensions of such theories to conditional frameworks can be accommodated
into Williams coherence, thereby exploiting the results already established for it. In
many cases, these problems have been so far little investigated; for instance, much
work remains to be done in the area of measuring conditional risks. Williams
coherence is not the only way of extending Walley’s (unconditional) coherence, but
it is a very general and (perhaps) immediate one; it is in any case important to weigh
pros and cons in choosing which coherence notion should be used. This evaluation
affects various issues, some more familiar (like the validity of the envelope theorem),
others generally less familiar (like the problem of non-conglomerability).

The main purpose of this paper (extending earlier results in [16], Section 3)
is to investigate more closely the role of Williams coherence, comparing it with
the nearest consistency concepts that have been developed in the literature. Since
Williams’ work was inspired by de Finetti’s ideas, these concepts are among those
following and generalising the subjective probability approach to uncertainty. We
supply some historical information on this in Section 2.1. The hints presented
there are historically not exhaustive, being limited to key contributions and taken from
the perspective of studying Williams coherence, but they let us mention a few
important properties regarding all of these concepts, which form a basis for making
comparisons among them in the paper. Section 2.2 contains other preliminary
definitions and notions.

We investigate Williams coherence with a progressively larger perspective throughout
the paper. We start in Section 3.1 by discussing a nimbler variant of it, called
W-coherence (already defined in Section 2.2), which is adopted in the sequel. It
generalises Walley coherence for unconditional lower previsions. In Section 3.2 we
discuss potential generalisations of other unconditional coherence concepts, focusing
in particular on a little explored definition [2, 10]. We supply an interesting
interpretation for it, showing that it has no straightforward extension to conditionals,
but deriving some conditions that are either necessary or sufficient for W-coherence.
In Section 3.3 W-coherence is compared with alternative views of conditional
coherence developed by Walley [26], proving in particular its equivalence with separate
coherence (when they are comparable, since separate coherence is less general). The
comparison is continued in Section 3.4, discussing non-conglomerability and the
different treatment of this property in Williams’ and Walley’s approaches. Concepts
related to W-coherence are discussed in Section 4. In particular, in Section 4.1 we
explore a notion intermediate between W-coherence and dF-coherence, showing its
little significance, while in Section 4.2 we discuss which condition of avoiding-loss
type should be appropriate when adopting W-coherence. It is shown that, using
a certain concept of avoiding uniform loss, some seemingly inconsistent features of
W-coherence pointed out by Walley can be justified. In Section 4.3 we discuss
centered convexity, a relevant concept, (moderately) weaker than W-coherence. We
point out that centered convexity shares desirable properties with W-coherence,
even though this is true to a lesser extent as far as envelope theorems are
concerned. Our conclusions on the role of W-coherence in imprecise probability theory
are contained in Section 5.
2. Preliminary Issues

We recall first a few notions concerning the description of uncertainty. Following 
01 and others, we use the logical notation to operate with events. This originates 
from observing that events are described by propositions of classical logic, and 
actually a formal definition of events and conditional events in these terms was 
given in mm-. We write $B$ for both an event $B$ and its indicator function $|B|$
(de Finetti’s convention); the context makes clear which of the two meanings is
intended.

A bounded random variable$^1$ $X$ is represented by a map $X : \mathbb{P} \to \mathbb{R}$, where $\mathbb{P}$ is a
partition of (non-impossible) events. A possible value of $X$, $X(\omega)$, corresponds to
each $\omega \in \mathbb{P}$, which does not mean that $\mathbb{P}$ is unique. For instance, the partition whose
generic event is ‘$X = x$’ will do if we describe $X$ alone, but a more refined partition
is needed to describe two or more random variables simultaneously. In classical
probability theory, a unique fixed partition (called $\Omega$ there, while we reserve the
symbol $\Omega$ for the sure event), large enough to describe what matters, is employed.
This is not necessary in general (cf. the discussion in [26], Section 2.1.4) and will
not be pursued here.

When conditioning on some non-impossible event $B$, the conditional random
variable $X|B$ may be represented by $X_B : \mathbb{P}|B \to \mathbb{R}$, where the elements of the
conditional partition $\mathbb{P}|B$ are obtained replacing each $\omega \in \mathbb{P}$ with the conditional
event $\omega|B$, and discarding those $\omega|B$ which turn out to be impossible, conditional
on $B$ (i.e., such that assuming $B$ true implies that $\omega$ is false). After this is done,
$X_B(\omega|B) = X(\omega)$ holds. In the special case that $B = \Omega$, we reobtain unconditional
random variables ($X|\Omega = X$).

The supremum $\sup(X|B)$ of $X|B$ may be computed as $\sup_{\omega \Rightarrow B} X(\omega)$ (in the
set-theoretic language: $\sup_{\omega \in B} X(\omega)$).

When working with conditional random variables, we shall sometimes employ
the equality

(1) $f(X_1, \dots, X_n)|B = f(X_1|B, \dots, X_n|B)$,

where $f$ is any real function, returning the random variable $f(X_1, \dots, X_n)$ as a
function of $X_1, \dots, X_n$. A typical case we will consider in the paper is $f = G$,
where $G$ is a ‘gain’.

In the rest of the paper, the domain of the uncertainty measures considered is
usually termed $\mathcal{D}$. Precisely, $\mathcal{D}$ is an arbitrary (non-empty) set of bounded random
variables, or more generally of bounded conditional random variables. $\mathcal{D}$ may
contain conditional events too, corresponding to those $X|B \in \mathcal{D}$ such that $X$ is the
indicator of some event, or events when further $B = \Omega$.

A lower prevision $\underline{P}$ on $\mathcal{D}$ is a map $\underline{P} : \mathcal{D} \to \mathbb{R}$. An upper prevision $\overline{P}$ may be
defined through the equality $\overline{P}(-X|B) = -\underline{P}(X|B)$, $\forall X|B \in \mathcal{D}$, which always lets
us refer to either lower or upper previsions only. A precise prevision $P$ corresponds
to the special case $P(X) = \underline{P}(X) = \overline{P}(X)$.

2.1. A Historical Note. We shall deal in this paper with several notions of ‘coherence’,
or weaker concepts. Their forerunner is de Finetti’s coherence for (unconditional)
precise previsions [8]:

1 Also called gamble in m or bounded random quantity in m, whilst random quantities can
be unbounded in [8].
Definition 1. $P : \mathcal{D} \to \mathbb{R}$ is a dF-coherent precise prevision on $\mathcal{D}$ iff, $\forall n, m \in \mathbb{N}$,
$\forall X_1, \dots, X_n, Y_1, \dots, Y_m \in \mathcal{D}$, $\forall s_1, \dots, s_n \ge 0$, $\forall r_1, \dots, r_m \ge 0$, defining
$G = \sum_{i=1}^{n} s_i (X_i - P(X_i)) - \sum_{j=1}^{m} r_j (Y_j - P(Y_j))$, it holds that $\sup G \ge 0$.

This definition includes that of dF-coherent (precise) probability as a special case,
when all the random variables in $\mathcal{D}$ are (indicators of) events. If further $\mathcal{D}$ is an
algebra, dF-coherent probabilities coincide with finitely additive probabilities.

The notion of dF-coherent precise prevision is also closely related to that of
expectation: whenever an expectation $E(X)$ is assessed for $X$, then $E(X)$ is also its
only dF-coherent prevision. However, whenever $P(X)$ is assessed, $E(X)$ is not
necessarily defined, because no probability on all events ‘$X < x$’ needs to be preliminarily
elicited in order to define $P(X)$.

Although de Finetti did not develop extensively a theory of conditional previsions,
nor was he much concerned with imprecise previsions, several features in
his approach were influential also in most later generalisations. We mention the
following basic facts, referring to a generic, not specified ‘consistency’ property of
(precise or imprecise) previsions.

A) Previsions are announced on an arbitrary (non-empty) set of random variables
$\mathcal{D}$, and consequently the definition of their consistency is structure-free.

B) An extension theorem ensures that a consistent prevision can be extended
to any $\mathcal{D}' \supset \mathcal{D}$, so that the extension preserves the same type of consistency
on $\mathcal{D}'$.

C) Consistent previsions have a behavioural interpretation in some idealised 
betting scheme. 

When ‘consistency’ is replaced by ‘dF-coherence’, A), B) and C) are satisfied.$^2$ Concerning
the betting scheme, the random variable $G$ in Definition 1 is the gain from
a bet made up of $n + m$ elementary bets, $n$ ‘in favour of’ $X_1, \dots, X_n$ (the bettor is
willing to pay $s_i P(X_i)$ for receiving $s_i X_i$, $i = 1, \dots, n$), $m$ ‘against’ $Y_1, \dots, Y_m$ (the
bettor receives $r_j P(Y_j)$ to sell $r_j Y_j$, $j = 1, \dots, m$). The definition of dF-coherence
requires that, whatever the bet is, the gain cannot be negative and bounded away
from 0. DF-coherent previsions are linear and homogeneous, if the relevant quantities
are in the domain $\mathcal{D}$:

(2) $P(\alpha X + \beta Y) = \alpha P(X) + \beta P(Y)$.

When adding the constraint $m \le 1$ in Definition 1 we obtain Walley’s definition of
coherence for lower previsions:

Definition 2. $\underline{P} : \mathcal{D} \to \mathbb{R}$ is a coherent lower prevision on $\mathcal{D}$ iff, for all $n \in \mathbb{N}$,
$\forall X_0, X_1, \dots, X_n \in \mathcal{D}$, $\forall s_0, s_1, \dots, s_n \ge 0$, defining
$G = \sum_{i=1}^{n} s_i (X_i - \underline{P}(X_i)) - s_0 (X_0 - \underline{P}(X_0))$, it holds that $\sup G \ge 0$.

Again, items A), B) and C) above are satisfied by this definition. It has a
well-known behavioural interpretation, discussed in [26]. Some consequences of
this interpretation, not all highlighted in [26], may better stress the behavioural
difference with Definition 1. Precisely, $\underline{P}(X)$ is an agent’s supremum buying price

2 The dF-coherent extension is generally not unique. In the special case that the events ‘$X < x$’
are in $\mathcal{D}$ $\forall x \in \mathbb{R}$, while $X \notin \mathcal{D}$, the dF-coherent extension on $\mathcal{D} \cup \{X\}$ is unique, and as mentioned
above coincides with $E(X)$.
for $X$, and $G$ is the agent’s gain resulting from her/his buying $s_i X_i$, for $i = 1, \dots, n$,
and selling $s_0 X_0$. Coherence implies, writing the last term in $G$ as $s_0 \underline{P}(X_0) - s_0 X_0$,
that the agent may be forced to accept at most one of her/his supremum buying
prices, $s_0 \underline{P}(X_0)$, as an infimum selling price for $s_0 X_0$. The restriction ‘at most one’
does not apply to Definition 1, because $m$ there may be any natural number. In
general, we shall say that the agent bets on (in favour or against) $X$ with stake $s$.
With imprecise previsions, there is a fourth property that we will consider:

D) consistent imprecise previsions are characterised by some envelope theorem.

Generally speaking, envelope theorems relate a function in a certain set $\mathcal{F}$ to a
set $\mathcal{P}$ of other functions with well specified features. These theorems either ensure
that by performing the (pointwise) infimum or supremum on the elements of $\mathcal{P}$ we
get a function $f \in \mathcal{F}$, or else guarantee that every $f \in \mathcal{F}$ may be expressed as an
infimum or supremum over some set $\mathcal{P}$, or both (thus characterising the functions
in $\mathcal{F}$). Envelope theorems may be found in many different research areas, like for
instance cooperative games m. They are important because:

• they ensure an often simple way of assigning a function $f$ with the desired
consistency properties;

• when being also characterisation theorems, they allow an alternative, indirect
definition and interpretation of the functions in $\mathcal{F}$ by means of sets of
the (usually simpler) functions in $\mathcal{P}$. Moreover, they allow proving properties
of the functions in $\mathcal{F}$ using known results about the functions in
$\mathcal{P}$.

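Coherent lower previsions ensure property D): a real function $\underline{P}$ is a coherent lower
prevision over $\mathcal{D}$ if and only if $\underline{P}(X) = \inf_{P \in \mathcal{P}} \{P(X)\}$, $\forall X \in \mathcal{D}$ (inf is attained),
where $\mathcal{P}$ is a set of dF-coherent precise previsions $P$ dominating $\underline{P}$ on $\mathcal{D}$, i.e.

$P(X) \ge \underline{P}(X)$, $\forall X \in \mathcal{D}$, $\forall P \in \mathcal{P}$.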
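
As a minimal numerical sketch of the 'if' direction of this envelope theorem, the Python fragment below takes the lower envelope of two precise previsions on a three-outcome space and checks the superadditivity that coherence entails. The outcome space, the two mass functions, the gambles and the helper names (`expectation`, `lower_envelope`) are illustrative assumptions, not taken from the paper; precise previsions are modelled here as expectations under probability mass functions, which is one way of obtaining dF-coherent previsions on a finite space.

```python
def expectation(pmf, gamble):
    """Expectation of a gamble under a probability mass function."""
    return sum(p * x for p, x in zip(pmf, gamble))


def lower_envelope(pmfs, gamble):
    """Pointwise infimum (a minimum here) over a finite set of precise models."""
    return min(expectation(pmf, gamble) for pmf in pmfs)


# Hypothetical data: two precise models on outcomes {0, 1, 2}.
pmfs = [(0.5, 0.3, 0.2), (0.2, 0.3, 0.5)]
X = (1.0, 0.0, -1.0)
Y = (0.0, 2.0, 1.0)
X_plus_Y = tuple(x + y for x, y in zip(X, Y))

low_X = lower_envelope(pmfs, X)
low_Y = lower_envelope(pmfs, Y)
low_sum = lower_envelope(pmfs, X_plus_Y)

# Conjugate upper prevision: upper(X) = -lower(-X).
up_X = -lower_envelope(pmfs, tuple(-x for x in X))

# Superadditivity, a known consequence of coherence.
superadditive = low_sum >= low_X + low_Y - 1e-12
```

Under these assumed numbers the envelope assigns a lower prevision to each gamble that is attained by one of the two models, in line with the remark that the infimum is attained.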

Various generalisations of dF-coherence to conditional (precise or imprecise) previsions
have been proposed. The adherence of some of them to A), B) and D) will be
discussed throughout the paper. As for C), all of them have some behavioural interpretation.
This aspect will therefore be just outlined. In particular, dF-coherence
for conditional (precise) previsions was developed in the eighties in [5, 17], obeying
the requirements A), B), C) above.

Definition 3. $P : \mathcal{D} \to \mathbb{R}$ is a dF-coherent conditional (precise) prevision on $\mathcal{D}$
iff, for all $n, m \in \mathbb{N}$, $\forall X_1|B_1, \dots, X_n|B_n, Y_1|C_1, \dots, Y_m|C_m \in \mathcal{D}$, $\forall s_i \ge 0$
($i = 1, \dots, n$), $\forall r_j \ge 0$ ($j = 1, \dots, m$), defining
$G = \sum_{i=1}^{n} s_i B_i (X_i - P(X_i|B_i)) - \sum_{j=1}^{m} r_j C_j (Y_j - P(Y_j|C_j))$,
$B = \bigvee_{i=1}^{n} B_i \vee \bigvee_{j=1}^{m} C_j$, it holds that $\sup(G|B) \ge 0$.

Here the gain is $G|B$, a conditional random variable itself. Conditioning on
$B$ has the meaning of considering only those values for $G$ when at least one of
$B_1, \dots, B_n, C_1, \dots, C_m$ is true. Property (2) generalises to

(3) $P(\alpha X + \beta Y|B) = \alpha P(X|B) + \beta P(Y|B)$.

Coherence concepts for conditional imprecise previsions were given by Walley
[26], see Section 3.3. But the earliest proposal was that of Williams [29] in 1975.
His work had a limited diffusion in those years, but influenced Walley’s work and
contained in nuce several fundamental results in the theory of imprecise probabilities
[23].

2.2. W-coherence and Other Preliminaries. In a conditional environment, we
adopt the following generalisation of Definition 2 to define a coherent lower prevision
$\underline{P}(\cdot|\cdot)$:

Definition 4. $\underline{P} : \mathcal{D} \to \mathbb{R}$ is a coherent conditional lower prevision on $\mathcal{D}$ iff,
for all $n \in \mathbb{N}$, $\forall X_0|B_0, \dots, X_n|B_n \in \mathcal{D}$, $\forall s_0, s_1, \dots, s_n$ real and non-negative,
defining $B = \bigvee_{i=0}^{n} B_i$ and
$G = \sum_{i=1}^{n} s_i B_i (X_i - \underline{P}(X_i|B_i)) - s_0 B_0 (X_0 - \underline{P}(X_0|B_0))$,
it holds that $\sup(G|B) \ge 0$.

It is easy to realise that we would get an equivalent definition (adopted in [27])
by replacing $G|B$ with $G|S$, where the support $S$ is defined as
$S = \bigvee \{B_i : s_i \ne 0,\ i = 0, \dots, n\}$.
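
The gain in Definition 4 can be evaluated mechanically for any single bet on a finite outcome space. The Python sketch below does so in the support form $G|S$; the function name `sup_conditional_gain`, the encoding of random variables as tuples and of events as sets of outcome indices, and all numbers are hypothetical choices made for illustration. A negative value for some bet disproves W-coherence of the assessment, whereas a non-negative value for one bet proves nothing, since the definition quantifies over all bets.

```python
def sup_conditional_gain(buys, sell=None):
    """Return sup(G|S) for one bet, S being the union of the events involved.

    buys: list of (stake, X, B, price), with stake > 0, X a sequence of values
          indexed by outcome, B a set of outcome indices, and price the
          assessed lower prevision of X|B.
    sell: optional (stake, X0, B0, price) for the single term that is sold.
    """
    terms = list(buys)
    if sell is not None:
        s0, X0, B0, p0 = sell
        terms.append((-s0, X0, B0, p0))  # the sold term enters with a minus sign
    support = set().union(*(B for _, _, B, _ in terms))

    def gain(w):
        # Each term contributes only where its conditioning event is true.
        return sum(s * (X[w] - p) for s, X, B, p in terms if w in B)

    return max(gain(w) for w in support)


# Hypothetical assessment on outcomes {0, 1, 2}, conditioning event B = {0, 1}.
A = (1.0, 0.0, 0.0)       # indicator of the event {0}
not_A = (0.0, 1.0, 1.0)   # indicator of its negation
B = {0, 1}

# Lower previsions 0.7 for A|B and 0.5 for not_A|B sum to more than 1.
loss = sup_conditional_gain([(1.0, A, B, 0.7), (1.0, not_A, B, 0.5)])

# Lower previsions 0.3 and 0.5 pass this particular bet.
ok = sup_conditional_gain([(1.0, A, B, 0.3), (1.0, not_A, B, 0.5)])
```

In the first assessment the supremum of the conditional gain is negative, so buying both conditional events at the stated prices produces a loss whenever $B$ occurs.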

Throughout the paper, Definition 4 will be referred to as Williams coherence,
or W-coherence or simply coherence when unambiguous, but as we will explain in
Section 3.1 it is actually a structure-free version of the original Williams coherence.

A weaker notion than W-coherence is that of lower prevision that avoids uniform
loss EH, recalled in Section 4.2. In the unconditional environment it is termed
condition of avoiding sure loss and is defined in [26], Section 2.4.4 (a).

A further consistency notion, centered convexity da mi nsi, is weaker than coherence,
but sufficiently stronger than the conditions of avoiding sure or uniform
loss to allow for interesting properties and applications (for instance, in risk
measurement [I^)). Its relationship with W-coherence is discussed in Section 4.3.

Formally, the definition of convex lower prevision is obtained from Definition 2
and Definition 4 by introducing just the extra convexity constraint $\sum_{i=1}^{n} s_i = s_0$ ($> 0$)
and possibly by further imposing (this is not restrictive) that $s_0 = 1$ [13, 14].
Again, we could equivalently condition $G$ on its support $S$ rather than on $B$, as
done in mm-. Centered convexity requires in addition that ($0 \in \mathcal{D}$ and) $\underline{P}(0) = 0$
in the unconditional case, and further that $\forall X|B \in \mathcal{D}$, $0|B \in \mathcal{D}$ and $\underline{P}(0|B) = 0$ in
the conditional case (cf. Definition HOD . Centering is quite a natural requirement: 
non-centered convex previsions have rather weak consistency properties (see also 
Footnote 0 , but special instances of them may be found in the risk literature (cf. 

[HD- 

Let $\underline{P}$ be a lower prevision defined on an arbitrary set $\mathcal{D}$. Following B) of Section
2.1, any consistency condition satisfied by $\underline{P}$ should guarantee that there exists an
extension of $\underline{P}$ satisfying the same condition on any $\mathcal{D}' \supset \mathcal{D}$. If such an extension is
not unique, its vaguest or least-committal one, if existing, has a special importance.
This peculiar extension is the natural extension $\underline{E}$ in the case of coherent or, when
conditioning, W-coherent previsions [25, 26], the convex natural extension $\underline{P}_c$ for
centered convex (unconditional or conditional) previsions [HUH]. The natural or
convex natural extensions always exist for these consistency notions, not necessarily
with other ones, like Walley coherence in [26], Section 7.1.4 (b), or non-centered
convexity.


3. Coherence Concepts of Williams and Others 

3.1. About Williams’ Definition. Williams’ original definition ([25], Definition
1) differs formally from our definition of W-coherence. One reason is that it refers
to upper rather than lower previsions, but this is unimportant, since using the
conjugacy relation $\overline{P}(-X|B) = -\underline{P}(X|B)$ our condition $\sup(G|B) \ge 0$ corresponds
exactly to his inequality in (A*) of [29]. The true difference is that his notion is
not completely structure-free, as it asks in particular that, for every conditioning
event $B$, the set $\mathcal{X}_B = \{X : X|B \in \mathcal{D}\}$ is a linear space. It follows for instance that
Williams’ definition does not formally generalise Walley coherence for unconditional
previsions (our Definition 2), which is structure-free: when $B = \Omega$ for all $X|B \in \mathcal{D}$,
the set of all $X$ is constrained to form a linear space $\mathcal{X}_\Omega$. On the contrary, Definition
4 is in particular a generalisation of Walley’s unconditional coherence and appears
to be, in general, nimbler. The fundamental link between the two versions of
Williams coherence is ensured by the following extension theorem.

Proposition 1. If $\underline{P} : \mathcal{D} \to \mathbb{R}$ is W-coherent on $\mathcal{D}$ (according to Definition 4), it
has a W-coherent extension on any $\mathcal{D}' \supset \mathcal{D}$.

Although we are not aware of any published proof for this proposition, it
should be regarded as essentially known. In fact, it can be proven by
adapting the proofs concerning the convex natural extension in El., thus proving
that there always exists the natural extension of a W-coherent lower prevision
on any $\mathcal{D}' \supset \mathcal{D}$. A proof of this kind is given in the Appendix, for the sake of
completeness. Alternatively, the historically older scheme of de Finetti’s extension
theorem can be followed, with suitable (but basically minor) modifications. After
de Finetti’s path-breaking proof concerning precise (unconditional) previsions in
[7], this scheme was employed in several generalisations (see e.g. [1, 4, 9]). In the
version for W-coherence, its two-step proof shows in the first step that there exist
W-coherent extensions on $\mathcal{D}' = \mathcal{D} \cup \{X|B\}$, $\forall X|B$, while the second step generalises
the proof to any $\mathcal{D}'$ using Zorn’s lemma or equivalent results. A by-product of the
first step is that the set of admissible W-coherent extensions on $X|B$ is proved to
be a closed interval. Its lower endpoint is the natural extension $\underline{E}(X|B)$, while
the upper endpoint is the upper extension $\underline{U}(X|B)$ of $\underline{P}$. Thus, the scheme of de
Finetti’s extension theorem does not emphasise the role of the natural extension,
but rather treats the natural and upper extension in a symmetric way.

As an important implication of Proposition 1 in our framework, when $\mathcal{D}$ in
Definition 4 does not meet the structure requirements in Williams’ definition it is
always possible to coherently extend $\underline{P}$ on a set $\mathcal{D}'$ such that these requirements
hold, and there the two notions of coherence coincide. It follows that W-coherent
lower previsions have all the properties established for Williams coherence in [29],
including the important envelope theorem, stating that $\underline{P}$ is coherent on $\mathcal{D}$ if and
only if

$\underline{P}(X|B) = \inf_{P \in \mathcal{P}} P(X|B)$, $\forall X|B \in \mathcal{D}$,

where $\mathcal{P}$ is a set of dF-coherent precise previsions $P(\cdot|\cdot)$ dominating $\underline{P}(\cdot|\cdot)$ on $\mathcal{D}$
($\forall P \in \mathcal{P}$, $P(X|B) \ge \underline{P}(X|B)$, $\forall X|B \in \mathcal{D}$). Note that inf is attained.
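
The conditional envelope can be illustrated on a finite space with a short Python sketch. It rests on a simplifying assumption that is not part of the theorem: every precise model gives positive probability to the conditioning event, so that each conditional prevision is an ordinary conditional expectation (dF-coherent conditional previsions are more general, as they may also be defined when the conditioning event has zero probability). The helper names and all numbers are hypothetical.

```python
def conditional_expectation(pmf, gamble, event):
    """E[X|B] under a mass function; assumes the event has positive mass."""
    mass = sum(pmf[w] for w in event)
    if mass <= 0:
        raise ValueError("conditioning event has zero probability in this model")
    return sum(pmf[w] * gamble[w] for w in event) / mass


def conditional_lower_envelope(pmfs, gamble, event):
    """Lower envelope of the conditional expectations over the given models."""
    return min(conditional_expectation(pmf, gamble, event) for pmf in pmfs)


# Hypothetical data: two precise models on outcomes {0, 1, 2}.
pmfs = [(0.5, 0.3, 0.2), (0.2, 0.3, 0.5)]
X = (1.0, 0.0, -1.0)
B = {0, 1}

low_X_given_B = conditional_lower_envelope(pmfs, X, B)

# Conjugate upper prevision: upper(X|B) = -lower(-X|B).
neg_X = tuple(-x for x in X)
up_X_given_B = -conditional_lower_envelope(pmfs, neg_X, B)
```

With these assumed models the lower and upper conditional previsions of $X|B$ differ, and both lie between the infimum and the supremum of $X$ on $B$.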

3.2. From Unconditional to Conditional Coherence. As we have already
pointed out, Definition 4 of W-coherence generalises Walley coherence for unconditional
previsions (Definition 2). But other known definitions are equivalent to
Definition 2. An interesting issue is therefore: why not rather generalise them in a
conditional environment? An answer is that Definition 2 seems more appropriate
for further generalisations.

The matter is relatively simple and well known if we consider a version of coherence,
equivalent to Definition 2, obtained by restricting the stakes $s_0, \dots, s_n$ to
be integers (this is Walley’s Definition 2.5.1 in [26]). The constraint on the integer
stakes can be adopted in a conditional environment too, for W-coherence as well as
for some other consistency notions we discuss in this paper, obtaining equivalent
formulations. However, considering integer combinations only is not enough when
the random variables are unbounded, even in the unconditional case, as shown in
[22]. We are not dealing with unbounded random quantities here, yet in view of
(potentially) pursuing the utmost generality, we prefer not to impose the integer
stakes constraint.

The situation is more complex, and definitely less explored, when turning to the
following less used definition, which is known to be equivalent to Definition 2.

Definition 5. $\underline{P} : \mathcal{D} \to \mathbb{R}$ is a coherent lower prevision on $\mathcal{D}$ iff, for all $n \in \mathbb{N}$,
$\forall X_0, X_1, \dots, X_n \in \mathcal{D}$, $\forall r_1, \dots, r_n \ge 0$, $\forall \mu_0 \in \mathbb{R}$ such that $X_0 \ge \sum_{i=1}^{n} r_i X_i + \mu_0$,
it holds that $\underline{P}(X_0) \ge \sum_{i=1}^{n} r_i \underline{P}(X_i) + \mu_0$.

Definition 5 has a curious story: not mentioned explicitly in [26], although following
directly from results established there, it appears in [2], but without being
related to coherence for imprecise previsions, which was later done in [10]. To the
best of our knowledge, Definition 5 has not been given a clear behavioural interpretation
yet, nor has its potential generalisation to a conditional environment been
explored. We tackle these issues in this section.

As a first step, we rewrite the condition in Definition 5, that is,

(4) $X_0 \ge \sum_{i=1}^{n} r_i X_i + \mu_0 \Rightarrow \underline{P}(X_0) \ge \sum_{i=1}^{n} r_i \underline{P}(X_i) + \mu_0$,

in an equivalent form. For this, multiply the inequalities in (4) by $s_0 > 0$, let
$s_i = r_i s_0$ ($i = 1, \dots, n$), $\lambda_0 = \mu_0 s_0$ and perform the infimum in the first inequality
to obtain:

(5) $\lambda_0 \le \inf(s_0 X_0 - \sum_{i=1}^{n} s_i X_i) \Rightarrow \lambda_0 \le s_0 \underline{P}(X_0) - \sum_{i=1}^{n} s_i \underline{P}(X_i)$.

If we define $I = -s_0 X_0 + \sum_{i=1}^{n} s_i X_i$, $E = -s_0 \underline{P}(X_0) + \sum_{i=1}^{n} s_i \underline{P}(X_i)$, (5) is
rewritten as

(6) $\lambda_0 \le \inf(-I) \Rightarrow \lambda_0 \le -E$.

Let us now come to the behavioural interpretation of Definition 5. For any given
bet on $X_0, \dots, X_n$ with stakes $s_0, \dots, s_n$, $I$ is the bettor’s overall income ensuing
from the bet, while $E$ is her/his expense for betting. Note that $I$ is random,
while $E$ is not. From (6), Definition 5 asks as a necessary and sufficient condition
for coherence that $\inf(-I) \le -E$, i.e. that $\sup(I) \ge E$, for any bet. This is
a reasonable requirement: it does not hold iff $\sup I < E$ for some bet, and this
means that a specific bet can be arranged whose ensuing gain $G = I - E$ is strictly
negative and bounded away from zero whatever happens, and the bettor suffers
from a sure loss. It is clear then that Definitions 2 and 5 are equivalent: they both
require that no bet must be such that $\sup G < 0$.
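
The income/expense reading of an unconditional bet can be checked numerically. The Python sketch below computes $I$ and $E$ for one bet and compares $\sup I$ with $E$; the function name `income_and_expense`, the tuple encoding of random variables and the assessed prices are hypothetical, chosen so that the bet exposes a sure loss.

```python
def income_and_expense(buys, sell):
    """Split the gain of an unconditional bet into income I and expense E.

    buys: list of (stake, X, price) for the random variables bought.
    sell: (stake, X0, price) for the single random variable sold.
    Returns (I as a list indexed by outcome, E as a number).
    """
    s0, X0, p0 = sell
    outcomes = range(len(X0))
    income = [sum(s * X[w] for s, X, _ in buys) - s0 * X0[w] for w in outcomes]
    expense = sum(s * p for s, _, p in buys) - s0 * p0
    return income, expense


# Hypothetical assessment: two complementary events, each priced at 0.6.
X1 = (1.0, 0.0, 0.0)
X2 = (0.0, 1.0, 1.0)
X0 = (1.0, 1.0, 1.0)  # sold with stake 0, so it plays no role in this bet

income, expense = income_and_expense(
    [(1.0, X1, 0.6), (1.0, X2, 0.6)], (0.0, X0, 1.0)
)

# E is constant, hence sup G = sup(I - E) = sup I - E.
sup_gain = max(i - expense for i in income)
sure_loss = max(income) < expense
```

Here the income equals 1 in every outcome while the expense is 1.2, so $\sup I < E$ and the gain is negative whatever happens.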

The above interpretation also suggests a way to explore extensions of Definition
5 in a conditional framework. For this, rewrite the gain $G$ in Definition 4, highlighting
the expense and income terms. We have $I = -s_0 B_0 X_0 + \sum_{i=1}^{n} s_i B_i X_i$,
$E = -s_0 B_0 \underline{P}(X_0|B_0) + \sum_{i=1}^{n} s_i B_i \underline{P}(X_i|B_i)$ and the condition $\sup(G|B) \ge 0$ in
Definition 4 is written as

(7) $\sup(I - E|B) \ge 0$.


The following Proposition is fundamental for discussing the potential generalisations
of Definition 5.

Proposition 2. Consider, as in Definition 4, a bet on $X_0|B_0, \dots, X_n|B_n$ with
stakes $s_0, \dots, s_n$, respectively, and define $B = \bigvee_{i=0}^{n} B_i$. Let $\lambda_0$ be any real number.

a) The following condition implies condition (7):

(8) $\lambda_0 \le \inf(-I|B) \Rightarrow \lambda_0 \le \inf(-E|B)$.

b) Condition (7) implies that

(9) $\lambda_0 \le \inf(-I|B) \Rightarrow \lambda_0 \le \sup(-E|B)$.

Proof. a) Let (8) hold, and take $\lambda_0 = \inf(-I|B)$. Then $\inf(-E|B) \ge \inf(-I|B)
= -\sup(I|B)$, that is $\inf(-E|B) + \sup(I|B) \ge 0$. We obtain from this
$0 \le \sup(\inf(-E|B) + I|B) \le \sup(I - E|B)$, which is (7).
b) Let (7) hold. We obtain, assuming that $\lambda_0 \le \inf(-I|B)$ at the third inequality,
$0 \le \sup(I - E|B) \le \sup(I|B) + \sup(-E|B) = \sup(-E|B) - \inf(-I|B) \le
\sup(-E|B) - \lambda_0$. Hence $\lambda_0 \le \sup(-E|B)$, so that (9) holds. □

When $B_0 = \dots = B_n = \Omega$, i.e. when we consider a bet on unconditional random
variables only, both (8) and (9) reduce to (6). As a by-product, we reobtain the
known result that Definitions 2 and 5 are equivalent.

A comparison of conditions (6), (8) and (9) reveals that the expense $E$ is random
in a conditional environment: it depends on the outcomes of $B_0, \dots, B_n$ which
(apart from those $B_i = \Omega$, if any) are unknown to the bettor at the betting time.
This fact appears to be the real difficulty in trying to extend Definition 5 to a
conditional form: we actually get two versions, (8) and (9), with weaker properties.
Condition (9) is potentially useful to disprove W-coherence: if it does not hold
for some bet, the given $\underline{P}(\cdot|\cdot)$ is not W-coherent. Condition (8) is sufficient for
W-coherence, when holding for any bet. A condition slightly simpler than (8) may
be used for the same purpose under an additional constraint, as follows.

Proposition 3. Consider a bet in Definition 4 such that $\bigwedge_{i=0}^{n} B_i \ne \emptyset$. The following
condition implies condition (7):

(10) $s_0 \underline{P}(X_0|B_0) - \sum_{i=1}^{n} s_i \underline{P}(X_i|B_i) \ge \sup(-I|B)$.

Proof. We equivalently prove that if (7) does not hold, then $p^* = s_0 \underline{P}(X_0|B_0) -
\sum_{i=1}^{n} s_i \underline{P}(X_i|B_i) < \sup(-I|B)$. Noting for this that $\bigwedge_{i=0}^{n} B_i \ne \emptyset$ ensures that $p^*$
is a possible value for $-E|B$ and hence $p^* \in [\inf(-E|B), \sup(-E|B)]$, we get $0 >
\sup(I - E|B) = \sup(-E - (-I)|B) \ge \sup(-E|B) - \sup(-I|B) \ge p^* - \sup(-I|B)$,
from which $p^* < \sup(-I|B)$ follows. □

To ensure W-coherence using (10), it is necessary that $\bigwedge_{i=0}^{n} B_i \ne \emptyset$, for any bet.
A relevant special case which obeys this constraint is that of the conditioning events
in $\mathcal{D}$ forming a monotone (or nested) family, i.e. they can be totally ordered by
implication (or inclusion, in the set-theoretic approach).

Summing up, it does not seem possible to generalise Definition 5 while conditioning.
This should be ascribed to the nature of the term representing the ‘expense’
in the gain decomposition, which is generally random outside the unconditional
framework.

3.3. Alternative Concepts of Coherence. A further issue is that a number of
different generalisations of coherence (Definition 2 or equivalent) to a conditional
framework have been proposed in [26]: how do they relate to W-coherence? We
discuss some basic facts about this relationship in this section and the next one. A
further discussion of Walley’s criticism on Williams coherence needs some preliminaries
on the concept of avoiding uniform loss, and is therefore presented in Section
4.2.

The coherence concepts defined in Walley’s book [26] include: separate coherence,
which is the first coherence notion in a conditional framework, presented in Section
6.2.2, coherence with unconditional previsions (Section 6.3.2), which is generalised
to coherence in Section 7.1.4 (b), and weak coherence, defined in Section 7.1.4 (a).
Coherence as defined in Section 7.1.4 (b) is the prevailing concept in [26], and will
be referred to as Walley-coherence here.

None of these concepts is structure-free: a common feature is that the conditioning
events must belong to some partition and every (non-impossible) event $B$ in the
partition is a conditioning event for some $X|B \in \mathcal{D}$. Precisely, just one partition
is employed in the case of separate coherence (cf. Definition 6), a finite number of
partitions are used with Walley-coherence or weak coherence, two partitions (one of
which is the trivial partition $\mathcal{B}_0 = \{\Omega\}$, i.e. it corresponds to unconditional random
variables) in the case of coherence with unconditional previsions. The reason for
this kind of constraint lies in Walley’s requirement for conglomerability, a concept
discussed in the next section which is itself not structure-free. There are also other
constraints, see e.g. Section 6.3.1, which are less fundamental, in the sense that
several of them are made to simplify the theory but could be removed; m is a
paper in this direction.

It ensues that the discussion of, say, Walley-coherence of assignments on relatively
simple domains, like $\mathcal{D} = \{X_1|B_1, X_2|B_2, X_3|(B_1 \wedge (B_2 \vee B_3))\}$, cannot be
performed unless these domains are embedded in larger ones, satisfying the constraints
in [26] (this operation may not be simple, as it may require some extension
theorem which is not always available for Walley-coherence).

Because of these features, Walley’s notions of coherence are not always comparable
with W-coherence: there are domains where these notions are not defined, while
W-coherence always is. When making comparisons, we must consider W-coherence
only on those domains $\mathcal{D}$ which obey the constraints of the coherence notion it is
compared with. When this is done, W-coherence is equivalent to:

a) separate coherence (this is proven in Proposition 0 below); 

b) Walley-coherence, with the extra assumption that all partitions $\mathcal{B}_i$ of conditioning
events in that definition are finite (this equivalence is stated without
proof in [26]); without this assumption, W-coherence is more general than
Walley-coherence.

As for coherence with unconditional previsions, it is a special case of
Walley-coherence. Concerning weak coherence, it is implied by Walley-coherence but its
importance seems essentially instrumental in the theory in [26]. Useful results for
interpreting the conceptual difference between weak coherence and Walley-coherence
were recently given in [12].

Separate coherence has an important role in [26], as it is a prerequisite for the
other kinds of coherence. We are now going to prove its equivalence with
W-coherence. We first state a preliminary result, which is of some interest in itself, as
it simplifies checking W-coherence of $\underline{P}: \mathcal{D} \to \mathbb{R}$ if the
conditioning events of all $X|B \in \mathcal{D}$ have a special separation structure.

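Proposition 4. Given $\underline{P}: \mathcal{D} \to \mathbb{R}$, let $\mathcal{C}$ be a partition
and suppose that, for any $X|B \in \mathcal{D}$, $B$ implies some event in $\mathcal{C}$. Define,
$\forall C \in \mathcal{C}$, $\mathcal{D}_C = \{X|B \in \mathcal{D} : B \Rightarrow C\}$.
If $\underline{P}$ is W-coherent on each $\mathcal{D}_C$, then it is W-coherent on $\mathcal{D}$.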

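Proof. The assumptions imply that $\mathcal{D} = \bigcup_{C \in \mathcal{C}} \mathcal{D}_C$, and
that a generic gain $G$ in Definition Q] may be written as follows, emphasising that
distinct random variables may have the same conditioning event:

$$G = \sum_{i=1}^{n} \sum_{j=1}^{m_i} s_{ij} B_i (X_{ij} - \underline{P}(X_{ij}|B_i)) - s_0 B_0 (X_0 - \underline{P}(X_0|B_0)).$$

Now take, say, $B_1$ and suppose $B_1 \Rightarrow C_1 \in \mathcal{C}$. Then obviously
$\sup(G|B) \geq \sup(G \,|\, \bigvee_{B_i \Rightarrow C_1} B_i)$, where
$\bigvee_{B_i \Rightarrow C_1} B_i$ ($\neq \emptyset$, since at least $B_1 \Rightarrow C_1$) sums
those $B_i$ among $B_0, B_1, \ldots, B_n$ that imply $C_1$. But
$G \,|\, \bigvee_{B_i \Rightarrow C_1} B_i$ is the conditional gain of a bet on (some) elements
of $\mathcal{D}_{C_1}$ only, because those (and only those) $X_{ij}|B_i$ (or possibly $X_0|B_0$)
which are not in $\mathcal{D}_{C_1}$ are filtered out, when conditioning on
$\bigvee_{B_i \Rightarrow C_1} B_i$, by their indicators $B_i$ (or $B_0$), which all take value
zero. For instance, if $\bigvee_{B_i \Rightarrow C_1} B_i = B_1 \vee B_3$, then

$$G|B_1 \vee B_3 = \Big(\sum_{j=1}^{m_1} s_{1j} B_1 (X_{1j} - \underline{P}(X_{1j}|B_1)) + \sum_{j=1}^{m_3} s_{3j} B_3 (X_{3j} - \underline{P}(X_{3j}|B_3))\Big) \Big| B_1 \vee B_3.$$

It follows from W-coherence of $\underline{P}$ on $\mathcal{D}_{C_1}$ that
$\sup(G \,|\, \bigvee_{B_i \Rightarrow C_1} B_i) \geq 0$, hence also $\sup(G|B) \geq 0$. □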

Remark. We may replace 'W-coherent' with 'dF-coherent' in Proposition 4,
obtaining another true proposition. This is because the preceding proof relies
essentially on the structure of $\mathcal{D}$. □

In the sequel we shall apply Proposition 4 in the special case that the events $B$
themselves form the partition $\mathcal{C}$. Let now $\mathcal{B}$ be an arbitrary (finite or
not) partition of non-impossible events.

Definition 6. The conditional lower previsions $\underline{P}_B(X|B)$, defined for any
$B \in \mathcal{B}$ and $X \in \mathcal{H}(B)$, where $\mathcal{H}(B)$ is an arbitrary set of random
variables containing $B$, are separately coherent iff, for every $B \in \mathcal{B}$,

i) $\underline{P}_B(B|B) = 1$;

ii) $\forall s_0, \ldots, s_n \geq 0$, $\forall X_0, \ldots, X_n \in \mathcal{H}(B)$, defining
$G = \sum_{i=1}^{n} s_i (X_i - \underline{P}_B(X_i|B)) - s_0 (X_0 - \underline{P}_B(X_0|B))$,
it holds that $\sup G \geq 0$.

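Define now the conditional lower prevision $\underline{P}$ such that
$\underline{P}(X|B) = \underline{P}_B(X|B)$, $\forall B \in \mathcal{B}$, $\forall X \in \mathcal{H}(B)$
($\underline{P}$ is the collection of all $\underline{P}_B$).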

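Proposition 5. The lower previsions $\underline{P}_B$ ($B \in \mathcal{B}$) in Definition 6 are
separately coherent iff $\underline{P}$ is W-coherent on
$\mathcal{D} = \bigcup_{B \in \mathcal{B}} \mathcal{D}_B$, where $\mathcal{D}_B = \{X|B : X \in \mathcal{H}(B)\}$.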

Proof. We prove first that W-coherence implies separate coherence. If $\underline{P}$ is
W-coherent, i) trivially holds. With regard to ii), it follows from

$$\sup G = \max\{\sup_B G, \sup_{B^c} G\} \geq \sup_B G = \sup(G|B) = \sup(BG|B) \geq 0,$$

the last equality holding by ©, the last inequality by W-coherence.

To prove the converse implication, suppose that separate coherence holds. Betting
on $B, X_0, \ldots, X_n \in \mathcal{H}(B)$, it follows then that

$$\sup\Big(s(B - \underline{P}(B|B)) + \sum_{i=1}^{n} s_i (X_i - \underline{P}(X_i|B)) - s_0 (X_0 - \underline{P}(X_0|B))\Big) = \sup(s(B - 1) + G) = \max\big(\sup_B (s(B - 1) + G), \sup_{B^c} (s(B - 1) + G)\big) \geq 0.$$

If we choose $s > \max(\sup_{B^c} G, 0)$, the last inequality implies
$\sup_B (s(B - 1) + G) \geq 0$, since then $\sup_{B^c} (s(B - 1) + G) = -s + \sup_{B^c} G < 0$.
Using also (JT)),

$$\sup_B (s(B - 1) + G) = \sup(G|B) = \sup(BG|B) = \sup\Big(\sum_{i=1}^{n} s_i B (X_i - \underline{P}(X_i|B)) - s_0 B (X_0 - \underline{P}(X_0|B)) \,\Big|\, B\Big) \geq 0,$$

which means, given the arbitrariness of $n$, $X_0, \ldots, X_n$ and $s_0, \ldots, s_n \geq 0$, that
$\underline{P}$ is W-coherent on $\mathcal{D}_B$. Then W-coherence of $\underline{P}$ on each
$\mathcal{D}_B$ implies W-coherence of $\underline{P}$ on $\mathcal{D}$, because of Proposition 4
(where $\mathcal{C}$, $\mathcal{D}_C$ are now $\mathcal{B}$, $\mathcal{D}_B$ respectively). □

Footnote to Definition 6: This is the definition in m, after replacing integer stakes
with real non-negative ones.

W-coherence and Walley-coherence are equivalent (cf. b) above) when the partitions
$\mathcal{B}_i$ of conditioning events in Walley-coherence are all finite. In general,
properties of W-coherence involving only finitely many distinct conditioning events
hold for Walley-coherence too (a W-coherent assessment, or possibly one of its
W-coherent extensions, cf. Proposition 1, may be referred in this case to a finite set
of finite partitions $\mathcal{B}_i$). For instance, several product or sign rules are
discussed in m using W-coherence, but they hold with Walley-coherence too. One such
rule is that, if $\underline{P}$ is W-coherent on
$\mathcal{D} \supseteq \{AX|B, A|B, X|A \wedge B\}$ and $\underline{P}(X|A \wedge B) \geq 0$,
then $\underline{P}(AX|B) \geq \underline{P}(A|B) \cdot \underline{P}(X|A \wedge B)$.
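
The product rule just stated can be checked numerically on a toy model. The sketch below assumes, purely for illustration, that the lower prevision is the lower envelope of a small finite set of precise probabilities that all give positive mass to the conditioning events (the sensitivity-analysis reading associated with property D)). The possibility space, the credal set `CREDAL_SET` and the events `A`, `B` are hypothetical and do not come from the paper.

```python
from fractions import Fraction as F

# Hypothetical finite possibility space and set of precise probabilities
# (illustrative numbers only; every model gives positive mass to B and A ^ B).
OMEGA = range(4)
CREDAL_SET = [
    [F(1, 4), F(1, 4), F(1, 4), F(1, 4)],
    [F(1, 2), F(1, 4), F(1, 8), F(1, 8)],
    [F(1, 10), F(2, 10), F(3, 10), F(4, 10)],
]

def cond_prev(p, x, event):
    """Precise conditional prevision P(x | event); requires P(event) > 0."""
    mass = sum(p[w] for w in OMEGA if event(w))
    return sum(p[w] * x(w) for w in OMEGA if event(w)) / mass

def lower(x, event):
    """Lower envelope of the conditional previsions over the credal set."""
    return min(cond_prev(p, x, event) for p in CREDAL_SET)

A = lambda w: w in (0, 1)                 # event A
B = lambda w: w in (0, 1, 2)              # conditioning event B
A_and_B = lambda w: A(w) and B(w)
X = lambda w: F(w + 1)                    # a non-negative random variable
ind_A = lambda w: F(1) if A(w) else F(0)  # indicator of A
AX = lambda w: ind_A(w) * X(w)

lhs = lower(AX, B)                           # lower prevision of AX given B
rhs = lower(ind_A, B) * lower(X, A_and_B)    # product of the two lower previsions
print(lhs, rhs, lhs >= rhs)
```

With these numbers the inequality is strict, which is typical for a lower envelope: the infimum of a product of non-negative factors is at least the product of the infima, with equality only when one model minimises both factors.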

In general, W-coherence has the advantage over Walley-coherence that it verifies
properties A), B), D) in Section 2.1, while none of them necessarily holds
with Walley-coherence. Property D) also allows a sensitivity analysis interpretation
of W-coherence. W-coherence is not necessarily conglomerative, while
Walley-coherence is. This is a basic difference, and we comment on it in
Section 3.4.

Last but not least, we note that the notion of conditional random variable (and 
of conditional event) is often left at an informal level in the literature, including 
pill 2iJj . A formal approach to these and other descriptive tools of uncertainty, only 
sketched in Section 2, is developed in M-

Although the way conditional random variables or events are interpreted is seemingly
not particularly relevant in many matters, a greater formalisation turns out
to be useful in others. For an example, consider Lemma 6.2.4 in [26]: this
lemma states that, if $BX = BY$ and the separate coherence conditions i), ii) of
Definition 6 hold for a lower prevision $\underline{P}(\cdot|B)$, then
$\underline{P}(X|B) = \underline{P}(Y|B)$. The result depends on the interpretation of
conditional lower previsions in [26], which does not formally define conditional
random variables. But using the approach outlined in Section 2 and in particular m
with $n = 2$, $X_1 = B$, $X_2 = X$, $f(B, X) = BX$, and since $B|B$ (the indicator of event $B$
given that $B$ is true) takes value 1, we get $BX|B = (B|B) \cdot (X|B) = X|B$; thus the
condition $BX = BY$ alone implies $X|B = Y|B$. Consequently we achieve the more general
result that $\mu(X|B) = \mu(Y|B)$ whatever the uncertainty measure $\mu$ is, not because of
coherence ($\mu$ could even be incoherent), but merely because we are evaluating the
same thing.

3.4. The Issue of Non-Conglomerability. Suppose that an uncertainty measure
$\mu$ is given on a domain $\mathcal{D}$ which includes a random variable $X$ and the
conditional random variables $X|B$, for all $B$ in a given partition $\mathcal{B}$. Then $\mu$
is conglomerable (with respect to $X$ and $\mathcal{B}$) iff

$$\inf_{B \in \mathcal{B}} \mu(X|B) \leq \mu(X) \leq \sup_{B \in \mathcal{B}} \mu(X|B) \qquad (11)$$

while $\mu$ is non-conglomerable if (11) does not hold. In words, (11) requires $\mu(X)$ to
belong to the smallest interval containing all conditional evaluations $\mu(X|B)$.
When $X$ is (the indicator of) an event and $\mu$ is a precise probability $P$,
conglomerability may seem an obvious property at first sight, and in fact it holds
trivially if the partition $\mathcal{B}$ is finite. When $\mathcal{B}$ is infinite, the matter is
however much more complicated [18].

It was de Finetti who discovered in his 1930 paper [6] that dF-coherent
probabilities may be non-conglomerable, presenting two nice examples supporting this
seemingly counterintuitive fact. His examples anticipated the theory, as
Definition [2] was not known at that time. We now reconsider one such example,
showing that the probability it uses is actually dF-coherent.

Example. A number is chosen at random from the set $\mathbb{N}^+$ of positive integers.
Define $\omega_n$ = '$n$ is chosen', and call $\mathcal{B}_0$ the partition of all $\omega_n$,
$n \in \mathbb{N}^+$.

If $A$ is the event that an odd number is chosen, clearly $P(A) = \frac{1}{2}$. Defining
$B_n = \omega_{2n-1} \vee \omega_{4n-2} \vee \omega_{4n}$, $\forall n \in \mathbb{N}^+$,
$\mathcal{B}_{1,2} = \{B_1, \ldots, B_n, \ldots\}$ is a partition coarser than $\mathcal{B}_0$, and
$P(A|B_n) = \frac{1}{3}$, $\forall n$ (each $B_n$ says that the chosen number is one of three,
one odd and two even, so $B_1$ = '1, 2 or 4 is chosen', etc.). It ensues that (11) does
not hold, and $P$ is non-conglomerable.

The example is easily generalised, as noted in [6], replacing $\mathcal{B}_{1,2}$ with the
partition $\mathcal{B}_{h,k}$ such that each of its events $B'_1, \ldots, B'_n, \ldots$ implies that
one out of $h + k$ numbers is chosen, $h$ numbers being odd, $k$ even. Then
$P(A|B'_n) = \frac{h}{h+k} \neq \frac{1}{2} = P(A)$ if $h \neq k$: $P$ is non-conglomerable.

To prove that $P$ is dF-coherent on $\mathcal{D} = \{A, A|B'_1, \ldots, A|B'_n, \ldots\}$, note that
all possible gains in Definition Q] are of two disjoint types, according to whether they
include (a bet on) $A$ or not. For those which do not,
$\sup(G \,|\, B'_{i_1} \vee \ldots \vee B'_{i_r}) \geq 0$, applying the remark following
Proposition 4 (here $\mathcal{D}_C = \{A|B'_n\}$, and $P$ is dF-coherent on $\mathcal{D}_C$ since
$\frac{h}{h+k} \in [0, 1]$). A generic $G$ including $A$ may be written, in a way shorter but
equivalent to that of Definition [I], as

$$G = s\Big(A - \frac{1}{2}\Big) + \sum_{j=1}^{r} s_j B'_{i_j} \Big(A - \frac{h}{h+k}\Big),$$

where $s, s_1, \ldots, s_r$ may take any real value. Among those events $\omega_n$ of partition
$\mathcal{B}_0$ such that $\omega_n \wedge (B'_{i_1} \vee \ldots \vee B'_{i_r}) = \emptyset$, there are
some implying $A$, while others imply $A^c$. If $\omega_n \Rightarrow A$, $G(\omega_n) = \frac{1}{2}s$;
when $\omega_n \Rightarrow A^c$, $G(\omega_n) = -\frac{1}{2}s$. In all cases, $\max G \geq 0$.

□
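
The arithmetic behind this example can be illustrated with a finite truncation. The sketch below is only an illustration under stated assumptions: the uniform assessment on the positive integers is finitely additive and cannot be simulated exactly, so the code replaces it with relative frequencies on the first $2N$ integers (the truncation level `N` and all function names are hypothetical). It checks that the events $B_n$ are disjoint, that the share of odd numbers is 1/2 overall and 1/3 inside every $B_n$, and that inequality (11) fails for these values.

```python
from fractions import Fraction

def block(n):
    """B_n = {2n-1, 4n-2, 4n}: one odd number and two even numbers."""
    return {2 * n - 1, 4 * n - 2, 4 * n}

def odd_share(numbers):
    """Relative frequency of odd numbers in a finite collection."""
    numbers = list(numbers)
    return Fraction(sum(k % 2 for k in numbers), len(numbers))

def is_conglomerable(mu_x, conditional_values):
    """Check inequality (11) for one unconditional and several conditional values."""
    return min(conditional_values) <= mu_x <= max(conditional_values)

N = 1000  # truncation level: an assumption of this sketch, not part of the example
blocks = [block(n) for n in range(1, N + 1)]
covered = set().union(*blocks)

# The blocks are pairwise disjoint and cover every integer from 1 to 2N.
disjoint = sum(len(b) for b in blocks) == len(covered)
covers_initial_segment = set(range(1, 2 * N + 1)) <= covered

p_A = odd_share(range(1, 2 * N + 1))                # stands in for P(A)
p_A_given_blocks = [odd_share(b) for b in blocks]   # stands in for P(A|B_n)
print(p_A, set(p_A_given_blocks), is_conglomerable(p_A, p_A_given_blocks))
```

The truncation does not prove dF-coherence of the limiting assessment; it only makes visible why the unconditional value 1/2 lies outside the interval spanned by the conditional values, all equal to 1/3.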

As this example shows, there may be instances where quite natural uncertainty 
evaluations are consistent, but non-conglomerable. We believe that in principle 
non-conglomerability should not be ruled out a priori. 

The issue of non-conglomerability is a root difference between Williams’ and 
Walley’s approaches to conditional coherence. Williams, following de Finetti, does 
not require conglomerability. Thus, for instance, the probability P in the example 
is a special case of W-coherent prevision. 

Walley asks for conglomerability in the consistency concepts, other than separate
coherence, he develops in a conditional framework. These concepts should comply
with a conglomerative principle ([26], Section 6.3.3); technically, his consistency
notions implement this principle by including terms like
$G(X|\mathcal{B}) = \sum_{B \in \mathcal{B}} B(X - \underline{P}(X|B))$ in the expressions of the
gains (we shall meet one such term in Section 4.2.1, equation (13)). These terms are
well-defined also when $\mathcal{B}$ is infinite, because the factors $B$ are the indicators of
events in a partition $\mathcal{B}$. Thus only one of them is non-null, whatever happens, and
hence the summation is always made up of a single term. Conglomerability implies then
various conditions, similar to (11) ([26], Section 6.5). In the case of Walley-coherent
precise previsions, it implies axiom (C14) in [26], Section 6.5.7, i.e.

$$P(X) \geq \inf_{B \in \mathcal{B}} P(X|B). \qquad (12)$$

Actually, it is proven in [26] that (12) is equivalent to Walley-coherence for precise
previsions, under certain structure constraints on $\mathcal{D}$. These constraints imply
in particular that $(X, X|B \in \mathcal{D}) \Rightarrow (-X, -X|B \in \mathcal{D})$, a condition
ensuring alone that (12) is equivalent to (11), since
$P(-X) \geq \inf_{B \in \mathcal{B}} P(-X|B)$ iff $P(X) \leq \sup_{B \in \mathcal{B}} P(X|B)$.

In particular, it ensues from this argument that the probability $P$ in the example
(technically, any of its dF-coherent extensions on a set $\mathcal{D}$ meeting the structure
requirements of Walley-coherence) is not Walley-coherent.

More generally, non-conglomerable dF-coherent conditional previsions are not
Walley-coherent (they do not satisfy conglomerative conditions like (12)). Note that
the term 'linear prevision' in [26] identifies dF-coherent previsions in the
unconditional environment (the first five chapters), but corresponds to those
dF-coherent conditional previsions which are conglomerable in a conditional setting
(see Section 6.5.7 in [26]).

The issue of conglomerability allows a more in-depth explanation of the differences
between W-coherence and Walley-coherence. We pinpoint the following
items:

a) If we wish an uncertainty measure $\mu$ to be conglomerable, some constraints
must be imposed on its domain $\mathcal{D}$, as appears already from (11): if
$X|B \in \mathcal{D}$, then it must hold that $X|B' \in \mathcal{D}$ $\forall B'$ in some partition
including $B$. In particular, this or analogous constraints seem unavoidable in
Walley-coherence, which is necessarily not structure-free.

b) Walley's approach may be interpreted as a thorough investigation of
conglomerable imprecise previsions. It can be adopted, if one feels that
imposing conglomerability does not rule out some significant models in the
specific uncertain situation being investigated.

c) Conglomerable imprecise previsions have some additional properties, ensuing
from inequalities like (11), (12), which are helpful in several derivations
and problems. The disadvantage is that they do not always ensure that
the envelope theorem holds, or that there exists a conglomerable natural
extension.


4. Beyond Williams Coherence 

We explore in this section how W-coherence relates to other consistency concepts,
either stronger (Section 4.1) or weaker (Sections 4.2, 4.3).

4.1. Between Williams' and de Finetti's Coherence? As is well known, coherence
for lower previsions (Definition 2) may be obtained formally from dF-coherence
(Definition 1) by restricting the number $m$ of bets 'against' some $X \in \mathcal{D}$
(unconstrained with dF-coherence) to $m \leq 1$. The same constraint distinguishes, in a
conditional framework, W-coherence (Definition U) from dF-coherence (Definition
©: with W-coherence we can bet against (at most) one $X_0|B_0 \in \mathcal{D}$.

A natural question is then: what if we relax this constraint, for instance asking
(to keep the relaxation at its minimum) that we can bet 'against' at most two
$X|B \in \mathcal{D}$? Shall we obtain a significant concept of coherence, intermediate between
W-coherence and dF-coherence? The answer is essentially negative, even in an
unconditional environment. For simplicity, we illustrate this case only.

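Definition 7. $\underline{P}: \mathcal{D} \to \mathbb{R}$ is a bi-coherent lower prevision on
$\mathcal{D}$ iff, for all $n \in \mathbb{N}$, $\forall X_1, \ldots, X_n, Y_1, Y_2 \in \mathcal{D}$,
$\forall s_1, \ldots, s_n, r_1, r_2$ real and non-negative, defining
$G = \sum_{i=1}^{n} s_i (X_i - \underline{P}(X_i)) - r_1 (Y_1 - \underline{P}(Y_1)) - r_2 (Y_2 - \underline{P}(Y_2))$,
$\sup G \geq 0$.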

Clearly, any bi-coherent lower prevision satisfies Definition 2 as well and is
therefore coherent. It also avoids sure loss ([26], Section 2.4.4 (a)), like (as is
well known) any coherent lower prevision. Further,

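Proposition 6. Let $\underline{P}: \mathcal{D} \to \mathbb{R}$ be bi-coherent.

a) If $X, Y, X + Y \in \mathcal{D}$, then $\underline{P}(X + Y) = \underline{P}(X) + \underline{P}(Y)$.

b) If $X, aX \in \mathcal{D}$ ($a \in \mathbb{R}$), then $\underline{P}(aX) = a\underline{P}(X)$.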

Proof. To prove a), first observe that the coherence of $\underline{P}$ implies
$\underline{P}(X + Y) \geq \underline{P}(X) + \underline{P}(Y)$ ([26], Section 2.6.1 (e)). For the
reverse inequality, put $n = 1$, $X_1 = X + Y$, $Y_1 = X$, $Y_2 = Y$, $s_1 = r_1 = r_2 = 1$ in
Definition 7.

When $a \geq 0$, b) follows from the coherence of $\underline{P}$ ([26], Section 2.6.1 (f)). Let
us suppose $a < 0$. Putting $n = 2$, $X_1 = X$, $X_2 = aX$, $s_1 = -a$, $s_2 = 1$, $r_1 = r_2 = 0$ in
the gain in Definition 7, we get $\underline{P}(aX) \leq a\underline{P}(X)$. The opposite inequality
follows putting $n = 0$, $Y_1 = X$, $Y_2 = aX$, $r_1 = -a$, $r_2 = 1$. □
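
The two gains used in this proof are constant random variables, which is why the condition $\sup G \geq 0$ turns directly into an inequality between assessed values. The sketch below makes this visible with exact arithmetic; the possibility space, the random variables and the candidate values `p_X`, `p_Y`, `p_sum`, `p_aX` are hypothetical numbers chosen for illustration (and deliberately chosen to violate both properties).

```python
from fractions import Fraction as F
import random

random.seed(0)
OMEGA = range(5)

# Hypothetical random variables and candidate assessments (illustrative numbers).
X = [F(random.randint(-5, 5), 3) for _ in OMEGA]
Y = [F(random.randint(-5, 5), 3) for _ in OMEGA]
a = F(-2)  # a negative scalar, as in part b) of the proof
p_X, p_Y, p_sum, p_aX = F(1, 10), F(-1, 5), F(3, 10), F(1, 2)

def gain_additivity(w):
    """Definition 7 with n = 1, X_1 = X + Y, Y_1 = X, Y_2 = Y and unit stakes."""
    return (X[w] + Y[w] - p_sum) - (X[w] - p_X) - (Y[w] - p_Y)

def gain_homogeneity(w):
    """Definition 7 with n = 2, X_1 = X, X_2 = aX, s_1 = -a, s_2 = 1, r_1 = r_2 = 0."""
    return -a * (X[w] - p_X) + (a * X[w] - p_aX)

# Each gain takes a single value over the whole space.
add_values = {gain_additivity(w) for w in OMEGA}
hom_values = {gain_homogeneity(w) for w in OMEGA}
print(add_values, hom_values)
```

The first gain always equals `p_X + p_Y - p_sum` and the second `a * p_X - p_aX`, whatever `X` and `Y` are. With the numbers above both constants are negative, so these candidate values could not belong to a bi-coherent lower prevision.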

Proposition 6 emphasises that any bi-coherent lower prevision is linear and
homogeneous on a large enough domain, i.e. it behaves essentially like a dF-coherent
prevision (cf. ©). Actually, any bi-coherent lower prevision is dF-coherent, when
the domain on which it is defined is sufficiently rich, as the following corollary of
Proposition 6 points out.

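Corollary 1. Let $\underline{P}: \mathcal{D} \to \mathbb{R}$ be bi-coherent. If either
$-X \in \mathcal{D}$ $\forall X \in \mathcal{D}$ or $X + Y \in \mathcal{D}$ $\forall X, Y \in \mathcal{D}$,
then $\underline{P}$ is dF-coherent.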

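Proof. Let $-X \in \mathcal{D}$ $\forall X \in \mathcal{D}$. Since $\underline{P}$ avoids sure loss, and
$\underline{P}(X) = -\underline{P}(-X)$ $\forall X \in \mathcal{D}$ by Proposition 6 b), dF-coherence of
$\underline{P}$ follows at once from Theorem 2.8.2 in [26]. Let now $X + Y \in \mathcal{D}$
$\forall X, Y \in \mathcal{D}$. Since $\underline{P}$ is coherent,
$\underline{P}(X) \geq \underline{P}(Y) + \mu$, $\forall X, Y \in \mathcal{D}$ such that $X \geq Y + \mu$
([26], Section 2.6.1 (d)). Besides, property a) in Proposition 6 holds. This implies
dF-coherence of $\underline{P}$ by Theorem 2.8.3 in [26]. □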

Nevertheless, a bi-coherent $\underline{P}$ is not necessarily dF-coherent when the domain
of $\underline{P}$ does not satisfy the closure properties of Corollary 1, as illustrated by the
following simple example.

Example. Let $\mathcal{B} = \{\omega_1, \omega_2, \omega_3\}$ be a partition and $\underline{P}$ the
vacuous coherent lower prevision on $\mathcal{B}$: $\underline{P}(\omega_i) = 0$ ($i = 1, 2, 3$).
Actually, $\underline{P}$ is bi-coherent as well. To show this, we prove that the supremum of
any gain in Definition 7 is non-negative. It is sufficient to inspect only the gains of
the form

$$G_i = s_i(\omega_i - \underline{P}(\omega_i)) - s_j(\omega_j - \underline{P}(\omega_j)) - s_k(\omega_k - \underline{P}(\omega_k)) = s_i\omega_i - s_j\omega_j - s_k\omega_k$$

($i \neq j \neq k \neq i$; $i, j, k \in \{1, 2, 3\}$; $s_i, s_j, s_k \geq 0$), since the non-negativity of
the supremum of any other kind of gain in Definition 7 is implied by the coherence of
$\underline{P}$. Clearly, $G_i(\omega_i) = s_i \geq 0$ ($i = 1, 2, 3$), hence $\underline{P}$ is
bi-coherent, although, patently, $\underline{P}$ is not dF-coherent. □
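
A brute-force check over a small grid of stakes illustrates this example and, for contrast, the same vacuous assessment on a partition with only two atoms. This is a sketch under stated assumptions, not a proof: stakes are restricted to the hypothetical grid `GRID`, bets in favour of the same atom are aggregated into one stake per atom, and the function names are illustrative.

```python
from itertools import product

def sup_gain(stakes_for, bets_against):
    """Supremum of the gain for the vacuous lower prevision (value 0 on every atom).

    stakes_for:   non-negative total stake in favour of each atom
    bets_against: at most two (atom_index, stake) pairs, as in Definition 7
    """
    assert len(bets_against) <= 2
    gain = list(stakes_for)          # G(omega_k) = s_k, since P(omega_k) = 0
    for atom, stake in bets_against:
        gain[atom] -= stake          # a bet against omega_k costs its stake on omega_k
    return max(gain)

GRID = [0, 1, 2]  # coarse grid of stakes: an illustration only

def bi_coherent_on_grid(n_atoms):
    """True if no gain on the grid has a negative supremum."""
    atoms = range(n_atoms)
    for stakes_for in product(GRID, repeat=n_atoms):
        for a1, a2 in product(atoms, repeat=2):
            for r1, r2 in product(GRID, repeat=2):
                if sup_gain(stakes_for, [(a1, r1), (a2, r2)]) < 0:
                    return False
    return True

print(bi_coherent_on_grid(3), bi_coherent_on_grid(2))
```

With three atoms, two bets against can touch at most two atoms, so the gain on the remaining atom is a non-negative stake. With two atoms, betting one unit against each atom and nothing in favour yields a gain of -1 everywhere, so the check fails.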

We note incidentally that the vacuous lower prevision is not always bi-coherent,
not even on partitions: if the partition in the example were
$\mathcal{B}' = \{\omega_1, \omega_2\}$, then $\sup G < 0$ in Definition 7 when
$G = -\omega_1 - \omega_2 = -1$ (i.e. when $n = 0$, $Y_i = \omega_i$, $r_i = 1$, $i = 1, 2$). This
also shows that coherence and bi-coherence are not equivalent, when bi-coherence may
differ from dF-coherence.
Those bi-coherent previsions which are not dF-coherent on $\mathcal{D}$ do not satisfy
property B) in Section 2.1, i.e. they do not ensure bi-coherent extensions on any
superset $\mathcal{D}' \supset \mathcal{D}$. This is shown by the following corollary.

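Corollary 2. Let $\underline{P}: \mathcal{D} \to \mathbb{R}$ be bi-coherent and let $\mathcal{L}$ be
any linear space that contains $\mathcal{D}$. Then $\underline{P}$ can be bi-coherently extended
on $\mathcal{L}$ if and only if $\underline{P}$ is dF-coherent on $\mathcal{D}$.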

Proof. The 'if' part follows from the extension theorem for dF-coherent previsions,
the 'only if' part from Corollary 1 (implying that any bi-coherent extension of
$\underline{P}$ on $\mathcal{L}$ is dF-coherent on $\mathcal{L}$, hence also on
$\mathcal{D} \subset \mathcal{L}$). □

Thus, for instance, the lower prevision presented in the previous example cannot
be bi-coherently extended to the set of random variables defined on the partition
$\mathcal{B}$. Corollary 2 could be further generalised: there are instances of bi-coherent,
but not dF-coherent, lower previsions that cannot be bi-coherently extended on
supersets which are not even linear spaces. The important message to convey is
anyway already clear: bi-coherence is not particularly significant, because either it
coincides with dF-coherence or, when it can differ from dF-coherence, property B)
of Section 2.1 may not hold, not even in rather common situations.

4.2. The Condition of Avoiding Uniform Loss. In the unconditional case, the
most studied consistency condition weaker than coherence (Definition 2) is that of
avoiding sure loss, obtained formally from Definition 2 putting $s_0 = 0$. With
W-coherence, the corresponding weaker notion is the following.

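Definition 8. $\underline{P}: \mathcal{D} \to \mathbb{R}$ avoids uniform loss (AUL) iff, for all
$n \in \mathbb{N}^+$, $\forall X_1|B_1, \ldots, X_n|B_n \in \mathcal{D}$, $\forall s_1, \ldots, s_n$ real and
non-negative, defining $B = \bigvee_{i=1}^{n} B_i$ and
$G = \sum_{i=1}^{n} s_i B_i (X_i - \underline{P}(X_i|B_i))$, it holds that $\sup(G|B) \geq 0$.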

The notion of avoiding uniform loss was used in mi where other equivalent
characterisations are supplied. When $\underline{P} = \overline{P} = P$, $P$ avoids uniform loss if
and only if $P$ is dF-coherent. Clearly, W-coherence of $\underline{P}$ implies that
$\underline{P}$ avoids uniform loss. The AUL condition is generally too weak, as appears
already at the unconditional level (cf. [26], Section 2.5). A more satisfactory notion
is that of centered convexity (cf. Section 4.3).

In this section we explore the relationship between the AUL condition and a similar
notion introduced in [26], and reconsider an example on W-coherence discussed
in [26] in the light of this.

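4.2.1. Walley's Condition of Avoiding Sure Loss. Given a partition $\mathcal{B}$ and two
arbitrary sets $\mathcal{H}$, $\mathcal{K}$ of unconditional random variables, such that
$0 \in \mathcal{H}$, $B \in \mathcal{H}$ $\forall B \in \mathcal{B}$, suppose throughout this section that
$\mathcal{D}$ has the following special structure:
$\mathcal{D} = \mathcal{K} \cup \bigcup_{B \in \mathcal{B}} \mathcal{D}_B$, where
$\mathcal{D}_B = \{Y|B : Y \in \mathcal{H}\}$.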

Definition 9. Let $\underline{P}: \mathcal{D} \to \mathbb{R}$ be such that

a) the restriction of $\underline{P}$ on $\mathcal{K}$ is a(n unconditional) coherent lower prevision;

b) the restrictions of $\underline{P}$ on each $\mathcal{D}_B$, $B \in \mathcal{B}$, are separately coherent.

Then $\underline{P}$ avoids sure loss on $\mathcal{D}$ iff, for all $m, n \in \mathbb{N}$,
$\forall X_1, \ldots, X_m \in \mathcal{K}$, $\forall Y_1, \ldots, Y_n \in \mathcal{H}$,
$\forall s_j \geq 0$ ($j = 1, \ldots, m$), $\forall t_i \geq 0$ ($i = 1, \ldots, n$),

$$\sup\Big(\sum_{j=1}^{m} s_j (X_j - \underline{P}(X_j)) + \sum_{i=1}^{n} t_i \sum_{B \in \mathcal{B}} B (Y_i - \underline{P}(Y_i|B))\Big) \geq 0. \qquad (13)$$

Discussion. Definition 9 is Walley's Definition 6.3.2 of avoiding sure loss in
[26]; here the following assumptions are introduced, without altering Definition
6.3.2:

i) the non-negative coefficients $s_j$, $t_i$ are real, but not necessarily integers;

ii) we do not require $\mathcal{H}$, $\mathcal{K}$ to be linear spaces, unlike condition (a) in [26],
Section 6.3.1, and modify Definition 6.3.2 correspondingly, as indicated
at the end of Section 6.3.1.

An interesting remark is that Definition 9 is formally no extension of the condition
of avoiding sure loss for unconditional previsions (Definition 2.4.4 (a) in [26]):
if $\mathcal{K} = \emptyset$ and $\mathcal{B} = \{\Omega\}$, it reduces to the notion of coherence
(Definition 2). This depends on assuming b) in Definition 9. The same remark applies to
the concepts of avoiding sure, partial or uniform loss defined in [26], chapter 7, since
separate coherence is a prerequisite for them too. □

Proposition 7. If $\underline{P}$ avoids sure loss on $\mathcal{D}$, it avoids uniform loss on $\mathcal{D}$.

Proof. Given the special structure of $\mathcal{D}$, all gains $G$ in Definition 8 may be
written as follows,

$$G = \sum_{j=1}^{m} s_j (X_j - \underline{P}(X_j)) + \sum_{i=1}^{q} \sum_{r=1}^{k_i} t_{ir} B_i (Y_{ir} - \underline{P}(Y_{ir}|B_i)), \qquad (14)$$

with $m, q \geq 0$. Suppose $\underline{P}$ avoids sure loss, and consider the following
(exhaustive) cases.

i) The second summation in (14) is zero ($q = 0$). Then $\sup G \geq 0$ follows
from Definition 9 a).

ii) The first summation in (14) is zero ($m = 0$). Separate coherence of $\underline{P}$ on all
$\mathcal{D}_B$ (Definition 9 b)) implies W-coherence of $\underline{P}$ on
$\bigcup_{B \in \mathcal{B}} \mathcal{D}_B$ (Proposition 5), which implies that $\underline{P}$ avoids
uniform loss on $\bigcup_{B \in \mathcal{B}} \mathcal{D}_B$, hence
$\sup(G \,|\, \bigvee_{i=1}^{q} B_i) \geq 0$.

iii) $m \cdot q > 0$. This implies $\sup(G|B) = \sup(G|\Omega) = \sup G$ in Definition 8.
We can write $G$ as a gain of the kind (13), since
$B_i(Y_{ir} - \underline{P}(Y_{ir}|B_i)) = \sum_{B \in \mathcal{B}} B(B_i Y_{ir} - \underline{P}(B_i Y_{ir}|B))$
(we used the fact that $BB_i = 0$ if $B \neq B_i$, and that
$B_i Y_{ir}|B = B_i|B \cdot Y_{ir}|B$: consequently, if $B \neq B_i$, $B_i Y_{ir}|B = 0|B$ and
$\underline{P}(B_i Y_{ir}|B) = 0$). Then $G$ in (14) is a gain of type (13) from a
bet on $X_1, \ldots, X_m$, and on the conditional random variables
$B_1 Y_{11}|B, \ldots, B_q Y_{q k_q}|B$, $\forall B \in \mathcal{B}$. Hence $\sup G \geq 0$.

In all cases, $G$ satisfies the conditions in Definition 8. □

Hence, Definition 9 is stronger than Definition 8 when they are comparable.
The key difference is that Definition 9 can be justified following a conglomerative
principle (cf. [26], Section 6.3.3) while Definition 8 does not rely on it. This fact is
relevant in explaining some of Walley's remarks on Williams coherence, as we shall
now see.

Footnote. Condition (b) of Section 6.3.1 in [26], i.e.
$Y \in \mathcal{H} \Rightarrow BY \in \mathcal{H}$, $\forall B \in \mathcal{B}$, may be replaced by our
assumptions on $\mathcal{D}$, in particular by $0 \in \mathcal{H}$. In fact, given any
$B, B^* \in \mathcal{B}$, we have that $B^*Y|B$ is equal to $Y|B$, by jTJ, when $B^* = B$, while,
when $B^* \neq B$, $B^*Y|B = 0|B$ ($\in \mathcal{D}$). Therefore, $\underline{P}(B^*Y|B)$ is defined
$\forall B, B^* \in \mathcal{B}$, which is what ensures condition (b) of Section 6.3.1 in [26]. We
did not mention condition (c) of Section 6.3.1 because it is unnecessary in the
following derivations.

Footnote. Note that the meaning of the term avoiding uniform loss in [26] is different
from that used in this paper, following Definition 8.

4.2.2. On the Consistency of Williams Coherence. A critical remark in [26] about
Williams coherence is that it does not always satisfy Walley's condition of avoiding
sure loss.

The important fact here is that if an agent adopts W-coherence, her/his reference
minimal consistency concept should be Definition 8 of avoiding uniform loss, or
an equivalent one. Referring to Definition 9 of avoiding sure loss would determine a
kind of inconsistency: the agent requires (with the condition of avoiding sure loss) and
does not require (with W-coherence) conglomerability at the same time.

Keeping the concept of avoiding uniform loss as a reference, the criticism of
W-coherence outlined in some examples in [26] does not apply. We discuss here one
such example ([26], Section 6.6.6).

Let $\mathcal{B}$ be a denumerable partition whose elements are indexed in the set
$\mathbb{Z} - \{0\}$ of non-zero integers and call $\omega_z$ the generic element in $\mathcal{B}$.
Define two dF-coherent precise probabilities $P^+$ and $P^-$ on $\mathcal{B}$, as follows:
$P^+(\omega_z) = 2^{-z}$ if $z > 0$, $P^+(\omega_z) = 0$, $\forall z < 0$, while $P^-(\omega_z) = 0$,
$\forall z$. Extend $P^+$, $P^-$ on $B = \bigvee_{z < 0} \omega_z$: clearly $P^+(B) = 0$ ($P^+$ is
$\sigma$-additive), while the extension of $P^-$ is not unique, and we may dF-coherently
choose $P^-(B) = 1$. The extensions on $A = B^c = \bigvee_{z > 0} \omega_z$ are then
$P^-(A) = 0$, $P^+(A) = 1$.

Define now $P = \frac{P^+ + P^-}{2}$. Since mixtures of dF-coherent probabilities are
dF-coherent, $P$ is dF-coherent. Let $n \in \mathbb{N}^+$, and define
$B_n = \omega_n \vee \omega_{-n}$. Because $P(B_n) = P(\omega_n) > 0$, the extension of $P$ on
$\omega_n|B_n$ is uniquely determined by Bayes' rule, and $P(\omega_n|B_n) = 1$. Similarly,
$P(A|B_n) = 1$, while $P(A) = \frac{1}{2}$. Then $P$ is a dF-coherent conditional probability on
$\mathcal{D} = \mathcal{B} \cup \{B, A, B_n, \omega_n|B_n, A|B_n\}$: this follows from the fact that
coherent (conditional or not) probabilities can be dF-coherently extended on any
event mm, and that the extension of $P$ on $\omega_n|B_n$ and $A|B_n$ is unique.
DF-coherence of $P$ on $\mathcal{D}$ is equivalent to its avoiding uniform loss on $\mathcal{D}$, when
viewing $P$ as a special imprecise prevision m. Thus $P$ does not incur uniform loss, but
it is shown in [26] that it incurs sure loss (in the sense of Definition 9). This is
because $P$ is non-conglomerable, and in fact it does not obey the conglomerability
axiom (12).
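
The numbers in this example can be reproduced with exact arithmetic. The sketch below rests on two assumptions that are not legible in the extracted text and are adopted here only for illustration: that the mixture uses equal weights, $P = (P^+ + P^-)/2$, and that $B_n = \omega_n \vee \omega_{-n}$. Function names are hypothetical, and only the first twenty events $B_n$ are enumerated.

```python
from fractions import Fraction

def P_atom(z):
    """P(omega_z) under the assumed equal-weight mixture of P+ and P-."""
    p_plus = Fraction(1, 2 ** z) if z > 0 else Fraction(0)
    p_minus = Fraction(0)            # P- vanishes on every atom
    return (p_plus + p_minus) / 2

# P(A) = (P+(A) + P-(A)) / 2, with P+(A) = 1 and P-(A) = 0 as chosen in the text.
P_A = (Fraction(1) + Fraction(0)) / 2

def P_A_given_Bn(n):
    """Bayes' rule on B_n = omega_n v omega_{-n} (assumed form of B_n)."""
    p_Bn = P_atom(n) + P_atom(-n)    # finite additivity on a two-atom event
    return P_atom(n) / p_Bn          # A ^ B_n = omega_n

conditionals = [P_A_given_Bn(n) for n in range(1, 21)]

# Axiom (12) with X = A would require P(A) >= inf_n P(A|B_n).
violates_axiom_12 = P_A < min(conditionals)
print(P_A, set(conditionals), violates_axiom_12)
```

Under these assumptions every conditional value equals 1 while the unconditional value is 1/2, so (12) fails for $X = A$, in line with the non-conglomerability noted in the text.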

Similar conclusions hold for other examples in [26]: inconsistencies arise only
when conglomerability axioms are used in a hybrid way. Thus the real question in
choosing between W-coherence and Walley-coherence (when they do not coincide)
seems to be whether or not to impose conglomerability.

4.3. Centered Convexity. While modifications of the definition of W-coherence 
towards some notions intermediate between it and dF-coherence seem to yield no 
really significant results, the notion of centered convexity is intermediate between 
that of avoiding uniform loss and W-coherence and has interesting properties. 

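Definition 10. $\underline{P}: \mathcal{D} \to \mathbb{R}$ is a convex conditional lower prevision on
$\mathcal{D}$ iff $\forall n \in \mathbb{N}^+$, $\forall X_0|B_0, \ldots, X_n|B_n \in \mathcal{D}$,
$\forall s_1, \ldots, s_n \geq 0$ such that $\sum_{i=1}^{n} s_i = 1$, defining
$G = \sum_{i=1}^{n} s_i B_i (X_i - \underline{P}(X_i|B_i)) - B_0 (X_0 - \underline{P}(X_0|B_0))$,
$\sup\{G \,|\, \bigvee_{i=0}^{n} B_i\} \geq 0$. Further, $\underline{P}$ is centered if, besides,
$0|B \in \mathcal{D}$ and $\underline{P}(0|B) = 0$, $\forall X|B \in \mathcal{D}$.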

The theory of centered convex previsions was developed in tnunum, generalising
in many respects the theory of W-coherence. These previsions satisfy
the properties A), B) and C) from Section 2.1, and operationally correspond to the
important notion of convex risk measure.

Property D) in Section 2.1 is the only one, among those stressed in this paper,
where W-coherence still has a definite advantage over centered convexity, at the
current state of the art. In the rest of this section, we give some explanation of this
fact. The material is derived from [15], where the interested reader may find more
details. We present here the simplest envelope theorem, whose proof requires
preliminarily the following
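Proposition 8. Let $\mathcal{P}$ be a set of convex conditional lower previsions defined on
$\mathcal{D}$. If $\underline{P}(X|B) = \inf_{\underline{Q} \in \mathcal{P}} \{\underline{Q}(X|B)\}$ is finite
$\forall X|B \in \mathcal{D}$, $\underline{P}$ is convex on $\mathcal{D}$.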

Proposition 8 generalises to convex conditional lower previsions a statement
already established for coherent [26] or convex unconditional m lower previsions.
The proof is similar to those in [13, 26] and is omitted.

Notation. Given $\mathcal{D}$, let $\mathcal{E} = \{B : \exists X|B \in \mathcal{D}\}$. □

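Theorem 1. (Envelope Theorem) Let $\mathcal{P}$ be a set of dF-coherent precise previsions
on $\mathcal{D} \cup \mathcal{E}$ such that $\forall P \in \mathcal{P}$, $P(B) > 0$ $\forall B \in \mathcal{E}$, and
let $\alpha: \mathcal{P} \to \mathbb{R}$ be a real function. Then

$$\underline{P}(X|B) = \inf_{P \in \mathcal{P}} \Big\{P(X|B) + \frac{\alpha(P)}{P(B)}\Big\}, \quad \forall X|B \in \mathcal{D} \qquad (15)$$

is a convex conditional lower prevision on $\mathcal{D}$, whenever the infimum in (15) is finite.
Further, $\underline{P}$ is centered iff
$\inf_{P \in \mathcal{P}} \big\{\frac{\alpha(P)}{P(B)}\big\} = 0$, $\forall B \in \mathcal{E}$.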

Proof. We prove that, $\forall P \in \mathcal{P}$, $\forall \alpha \in \mathbb{R}$,
$\underline{P}_\alpha(X|B) = P(X|B) + \frac{\alpha}{P(B)}$ is convex. The main thesis of the theorem
then follows from Proposition 8.

To prove that $\underline{P}_\alpha$ is a convex conditional lower prevision, we show that a
generic $G$ in Definition 10 may be referred to $P$, after substituting
$\underline{P}_\alpha(X|B)$ with $P(X|B) + \frac{\alpha}{P(B)}$, and hence its supremum is non-negative
because $P$ is dF-coherent. In fact, let $X_0|B_0, \ldots, X_n|B_n \in \mathcal{D}$,
$s_1, \ldots, s_n \geq 0$ such that $\sum_{i=1}^{n} s_i = 1$. Then $G$ can be written as

$$G = \sum_{i=1}^{n} s_i B_i \Big(X_i - P(X_i|B_i) - \frac{\alpha}{P(B_i)}\Big) - B_0 \Big(X_0 - P(X_0|B_0) - \frac{\alpha}{P(B_0)}\Big)$$
$$= \sum_{i=1}^{n} s_i B_i (X_i - P(X_i|B_i)) + \sum_{i=1}^{n} s_i (B_i \vee B_0)(Z_i - P(Z_i|B_i \vee B_0)) - B_0 (X_0 - P(X_0|B_0)),$$

where $Z_i = \alpha\big(\frac{B_0}{P(B_0)} - \frac{B_i}{P(B_i)}\big)$ and

$$P(Z_i|B_i \vee B_0) = \alpha\Big(\frac{P(B_0|B_i \vee B_0)}{P(B_0)} - \frac{P(B_i|B_i \vee B_0)}{P(B_i)}\Big) = \alpha\Big(\frac{1}{P(B_i \vee B_0)} - \frac{1}{P(B_i \vee B_0)}\Big) = 0$$

is, by m, the only coherent extension of $P$ on $Z_i|B_i \vee B_0$, $i = 1, \ldots, n$. In terms of
$P$, the gain $G$ is still conditioned on $B$, because $B$ is also the logical sum of the new
conditioning events: $B = \bigvee_{i=1}^{n} B_i \vee \bigvee_{i=1}^{n} (B_i \vee B_0) \vee B_0$. It
follows that $\sup(G|B) \geq 0$ by dF-coherence of $P$.

The proof of the second part of the theorem follows at once from noting that,
when $X|B = 0|B$, (15) reduces to
$\underline{P}(0|B) = \inf_{P \in \mathcal{P}} \big\{\frac{\alpha(P)}{P(B)}\big\}$. □
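
Formula (15) is easy to evaluate when the set of precise previsions is finite. The sketch below is a hypothetical illustration: the three-point possibility space, the two probability vectors and the penalties $\alpha(P)$ in `MODELS` are invented numbers, and every model gives positive probability to the conditioning events used. It computes the envelope (15) and the centering criterion of Theorem 1.

```python
from fractions import Fraction as F

# Hypothetical precise probabilities on a three-point space, each paired with a
# hypothetical penalty alpha(P) (illustrative numbers only).
MODELS = [
    ([F(1, 2), F(1, 4), F(1, 4)], F(0)),
    ([F(1, 5), F(2, 5), F(2, 5)], F(1, 10)),
]

def prob(p, event):
    """P(B) for an event given as a tuple of outcomes."""
    return sum(p[w] for w in event)

def cond_prev(p, x, event):
    """Precise conditional prevision P(X|B); requires P(B) > 0."""
    return sum(p[w] * x[w] for w in event) / prob(p, event)

def lower_prev(x, event):
    """Formula (15): minimum over the models of P(X|B) + alpha(P)/P(B)."""
    return min(cond_prev(p, x, event) + alpha / prob(p, event)
               for p, alpha in MODELS)

def is_centered(events):
    """Centering criterion of Theorem 1: min over P of alpha(P)/P(B) is 0 for every B."""
    return all(min(alpha / prob(p, B) for p, alpha in MODELS) == 0
               for B in events)

B = (0, 1)
X = [F(1), F(-1), F(3)]
ZERO = [F(0), F(0), F(0)]
print(lower_prev(X, B), lower_prev(ZERO, B), is_centered([B, (0, 1, 2)]))
```

Here the second model attains the minimum for `X` even though it carries a positive penalty, while the zero-penalty model guarantees that the lower prevision of `0|B` is 0, so the envelope is centered.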

Theorem 1 is not a characterisation theorem, and obviously cannot be applied
to arbitrary $\mathcal{D}$ and $\mathcal{E}$. One reason for presenting it is that it supplies us with a way


Footnote. There is a conceptual difference with coherence: since convexity does not
imply $\underline{Q}(X|B) \geq \inf(X|B)$ (internality), the finiteness condition of the infimum
must be required in Proposition 8. Internality holds when the convex previsions are
centered. This fact exemplifies that convexity without centering may be a rather weak
consistency requirement.










RENATO PELESSONI AND PAOLO VICIG 


of assessing centered convex previsions in the particular, but important case that
$P(B) > 0$, $\forall B \in \mathcal{E}$, $\forall P \in \mathcal{P}$.

Another motivation is that it informs us, through (15), about the type of functions
upon which the infimum is performed. Convexity requires adding a term
$\phi_P(B)$ to any dF-coherent prevision $P(X|B)$. This term is equal to
$\frac{\alpha(P)}{P(B)}$ in Theorem 1. If $P$ is unconditional, it reduces to $\alpha(P)$; if it is
W-coherent, $\phi_P = 0$, and we come to the familiar envelope theorems in [26, 29].

An envelope theorem which characterises centered convexity is given in m,
Theorem 8. We do not report it here, but stress the fact that its practical use
is considerably less immediate than the envelope theorem for W-coherence. In
fact, the set on which the infimum is performed depends on $X|B$ in this theorem.
Also the function $\phi_P(B)$ has a more complex structure, which is influenced by the
ordering of zero probabilities, for each $P \in \mathcal{P}$, among the possible conditioning
events. Seemingly, it is technically possible to circumvent such difficulties with
W-coherence because the function $\phi_P(B)$ may be set identically equal to zero there.

Thus W-coherence remains so far the most general concept for which D) in
Section 2.1 has a general practical as well as theoretical significance among those
discussed in this paper.


5. Conclusions 

We summarise our conclusions about the role of Williams coherence with the
help of Table 1, where consistency concepts for precise previsions first, and then
for imprecise previsions, are listed in order of increasing generality. Undoubtedly, a strong

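Table 1. Some consistency concepts for precise and imprecise previsions

+----------------------+------------------------+-------------------+----------------------+--------------------------------------+
|                      | Type of Prevision      | A) Structure Free | B) Extension Theorem | D) Envelope Theorem Characterisation |
+----------------------+------------------------+-------------------+----------------------+--------------------------------------+
| de Finetti-coherence | Precise, unconditional | Yes               | Yes                  | Does not apply                       |
| de Finetti-coherence | Precise, conditional   | Yes, in later studies                    | Does not apply                       |
| Coherence            | Lower, unconditional   | Yes               | Yes                  | Yes                                  |
| Walley-coherence     | Lower, conditional     | No                | Not always           | Not always                           |
| W-coherence          | Lower, conditional     | Yes               | Yes                  | Yes                                  |
| Centered convexity   | Lower, conditional     | Yes               | Yes                  | Yes (with operational constraints)   |
+----------------------+------------------------+-------------------+----------------------+--------------------------------------+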


motivation for adopting the variant of Williams coherence called W-coherence in
this paper is its generality: it meets all the properties we listed in Section 2.1, a
feature shared by coherence for unconditional lower previsions and by the root concept of
dF-coherence. Even the notion beyond W-coherence, i.e. centered convexity, while
being more general (but weaker) in many respects, fails to ensure a general
envelope theorem of comparable ease of use. If we restrict our attention to W-coherence
versus Walley-coherence, we may conclude that whenever they are not equivalent (if
they are, we may adopt either one) the choice depends essentially on our willingness
to accept some conglomerative axiom, and some domain constraints which are to a
large extent a consequence of it (accepting both items results in preferring Walley-coherence).
Given that W-coherence is more general than Walley-coherence, we may even use
W-coherence in principle, and Walley-coherence under specific circumstances, for
instance when studying stochastic processes. This case copes well with the domain
constraints of Walley-coherence, when we are interested in lower previsions like
$\underline{P}(X_n \mid \bigwedge_{i=1}^{n-1} (X_i = x_i))$, where $x_i$ is a generic value for the random variable $X_i$. In
fact, the events $\bigwedge_{i=1}^{n-1} (X_i = x_i)$ form a partition $\mathcal{B}_{n-1}$, for a given $n$ and by varying
$x_1, \ldots, x_{n-1}$ in all (jointly) possible ways.
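
The partition claim can be checked on a toy case. The following is only a minimal sketch, assuming a finite-valued process; the names `prefix_partition`, `is_partition` and the constraint `no_consecutive_ones` are hypothetical and not part of the paper. Paths $(x_1, \ldots, x_n)$ play the role of atoms, and each event $\bigwedge_{i=1}^{n-1}(X_i = x_i)$ is the set of jointly possible paths sharing the same first $n-1$ values.

```python
from itertools import product

def prefix_partition(values, n, possible=lambda path: True):
    """Group the jointly possible paths (x_1, ..., x_n) by their first n-1 values.

    Each value of the returned dict represents one event
    AND_{i<n} (X_i = x_i) as the set of paths it contains.
    """
    paths = [p for p in product(values, repeat=n) if possible(p)]
    blocks = {}
    for p in paths:
        blocks.setdefault(p[:-1], set()).add(p)
    return set(paths), blocks

def is_partition(universe, blocks):
    """True if the blocks are non-empty, pairwise disjoint and cover the universe."""
    seen = set()
    for block in blocks:
        if not block or seen & block:
            return False
        seen |= block
    return seen == universe

def no_consecutive_ones(path):
    """Illustrative joint-possibility constraint (an assumption of this sketch)."""
    return all(not (a == 1 and b == 1) for a, b in zip(path, path[1:]))

universe, blocks = prefix_partition((0, 1), 3, no_consecutive_ones)
print(sorted(blocks))                            # the jointly possible prefixes
print(is_partition(universe, blocks.values()))
```

Prefixes that are not jointly possible simply produce no block, which mirrors the requirement that the values vary "in all (jointly) possible ways".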

Similarly, new information in statistical inference may commonly arise from a
partition of possible hypotheses. Again, this is a favourable situation for applying
Walley-coherence, as far as its domain constraints are concerned, and is in fact largely
discussed in [26]. It should also be noted that cases where Walley-coherence ensures the
existence of a (conglomerable) extension are pointed out in [26], and that they are of a certain
generality. In other words, the ‘not always’ at the crossing of Walley-coherence and
property B) in Table 1 should be graded.

More generally, the theory of imprecise probabilities shows that there are often 
many alternatives for generalising familiar concepts (for instance, independence) 
from theories of precise probabilities or previsions, and that frequently there is 
no way to keep all the properties of the special precise probability case. Under 
these circumstances, we might want to employ different concepts of conditional
consistency, in order to achieve certain aims. A presentation of these conflicting
instances is given in [28] , where some alternative notions of imprecise conditional 
probability are presented. A further investigation of the consistency concepts in the 
conditional environment should include also these aspects, as well as other ideas 
developed in the literature. In particular, the game-theoretic approach in ii 
was recently related to Walley’s [5], and this could simplify the potential future 
work of relating it with Williams’ approach too. 

Appendix. Proof of Proposition [0 

We preliminarily recall a characterisation theorem, holding for W-coherent
conditional lower previsions defined on a structured domain $\mathcal{D}^*$ [29].

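Theorem 2. Let $\mathcal{X}$ be a linear space of bounded random variables, $\mathcal{E} \subset \mathcal{X}$ the set
of all indicator functions of events in $\mathcal{X}$. Let also $1 \in \mathcal{E}$ and $BX \in \mathcal{X}$, $\forall B \in \mathcal{E}$,
$\forall X \in \mathcal{X}$. Define $\mathcal{E}^{\emptyset} = \mathcal{E} - \{\emptyset\}$, $\mathcal{D}^* = \{X|B : X \in \mathcal{X}, B \in \mathcal{E}^{\emptyset}\}$. $\underline{P} : \mathcal{D}^* \to \mathbb{R}$ is a
W-coherent conditional lower prevision if and only if:

A1) $\underline{P}(X|B) \ge \inf\{X|B\}$, $\forall X|B \in \mathcal{D}^*$

A2) $\underline{P}(kX|B) = k\underline{P}(X|B)$, $\forall X|B \in \mathcal{D}^*$, $\forall k \ge 0$

A3) $\underline{P}(X + Y|B) \ge \underline{P}(X|B) + \underline{P}(Y|B)$, $\forall X|B, Y|B \in \mathcal{D}^*$

A4) $\underline{P}(A(X - \underline{P}(X|A \wedge B))|B) = 0$, $\forall X \in \mathcal{X}$, $\forall A, B \in \mathcal{E}^{\emptyset} : A \wedge B \ne \emptyset$.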
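
To make A1)-A4) concrete, the following is a minimal sketch, not the authors' method: it assumes a finite possibility space and a finite set of strictly positive probability mass functions, takes the lower envelope of the corresponding conditional expectations, and checks the four properties in exact rational arithmetic. The names `OMEGA`, `CREDAL_SET`, `lower` and `check_axioms` and all numbers are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical finite possibility space and set of strictly positive
# probability mass functions (illustrative numbers only).
OMEGA = (0, 1, 2, 3)
CREDAL_SET = [
    (F(1, 4), F(1, 4), F(1, 4), F(1, 4)),
    (F(1, 2), F(1, 6), F(1, 6), F(1, 6)),
    (F(1, 10), F(2, 5), F(3, 10), F(1, 5)),
]

def cond_prevision(p, x, b):
    """Precise conditional prevision E_p[X | B]; requires p(B) > 0."""
    p_b = sum(p[w] for w in b)
    return sum(p[w] * x[w] for w in b) / p_b

def lower(x, b):
    """Lower envelope: minimum of E_p[X | B] over the set of mass functions."""
    return min(cond_prevision(p, x, b) for p in CREDAL_SET)

def events():
    """All non-empty events, as frozensets of elements of OMEGA."""
    return [frozenset(c) for r in range(1, len(OMEGA) + 1)
            for c in combinations(OMEGA, r)]

def check_axioms(x, y, k):
    """Check A1)-A4) for the gambles x, y and the constant k >= 0."""
    evs = events()
    for b in evs:
        assert lower(x, b) >= min(x[w] for w in b)                    # A1
        assert lower(tuple(k * v for v in x), b) == k * lower(x, b)   # A2
        total = tuple(u + v for u, v in zip(x, y))
        assert lower(total, b) >= lower(x, b) + lower(y, b)           # A3
        for a in evs:
            ab = a & b
            if ab:
                mu = lower(x, ab)
                # W = A (X - P(X | A and B)), zero outside A
                w = tuple((x[i] - mu) if i in a else F(0) for i in OMEGA)
                assert lower(w, b) == 0                               # A4
    return True

X = (1, 4, 2, 7)
Y = (3, 0, 5, 1)
print(check_axioms(X, Y, F(5, 2)))
```

Exact fractions are used so that the equalities in A2) and A4) can be tested without rounding tolerances. Strict positivity of every mass function is an assumption of this sketch; it sidesteps the zero-probability conditioning events that the general theory has to handle.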

As in the unconditional case [26], the concept of natural extension plays a
fundamental role in extending $\underline{P}$.

Definition 11. Let $\underline{P} : \mathcal{D} \to \mathbb{R}$ be a conditional lower prevision, $X|B$ an arbitrary
bounded conditional random variable. Define $g_i = s_i B_i (X_i - \underline{P}(X_i|B_i))$,
$L(X|B) = \{\alpha : \sup\{\sum_{i=1}^n g_i - B(X - \alpha) \mid \bigvee_{i=1}^n B_i \vee B\} < 0$, for some $n \ge 0$, $X_i|B_i \in \mathcal{D}$, $s_i \ge 0\}$.
The natural extension of $\underline{P}$ to $X|B$ is $\underline{E}(X|B) = \sup L(X|B)$.
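
As a small worked illustration of Definition 11 (the possibility space and the values of $X$ below are assumptions chosen only for this example), consider the choice $n = 0$, which is always admissible:

```latex
% Assumed toy data: \Omega = \{\omega_1, \omega_2, \omega_3\},
% X(\omega_1) = 1, X(\omega_2) = 4, X(\omega_3) = 2, B = \{\omega_1, \omega_2\}.
% With n = 0 the sum of the g_i is empty and the conditioning event is B:
\[
  \sup\{-B(X - \alpha) \mid B\}
  = \sup_{\omega \in B} \bigl(\alpha - X(\omega)\bigr)
  = \alpha - \inf\{X|B\}
  = \alpha - 1 .
\]
% This is negative exactly when \alpha < 1, hence
\[
  \left]-\infty, 1\right[ \;\subseteq\; L(X|B),
  \qquad
  \underline{E}(X|B) = \sup L(X|B) \ge 1 = \inf\{X|B\}.
\]
```

Further assessments in $\mathcal{D}$ (choices with $n \ge 1$) can only enlarge $L(X|B)$, so they can only raise this lower bound.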

It is easily seen that $L(X|B) = \left]-\infty, \underline{E}(X|B)\right[$, a fact which will be used later.
Moreover, the natural extension proves to be bounded from above, when $\underline{P}$ is
W-coherent.

Proposition 9. Let $\underline{P} : \mathcal{D} \to \mathbb{R}$ be a W-coherent conditional lower prevision. Then
$\underline{E}(X|B) \le \sup\{X|B\}$, $\forall X|B$.

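Proof. Let $c = \sup\{X|B\}$, $n \ge 0$, $X_i|B_i \in \mathcal{D}$, $s_i \ge 0$ ($i = 1, \ldots, n$). Since
$B(X - c) \le 0$, using also W-coherence of $\underline{P}$ in the last inequality,
$\sup\{\sum_{i=1}^n g_i - B(X - c) \mid \bigvee_{i=1}^n B_i \vee B\} \ge \sup\{\sum_{i=1}^n g_i \mid \bigvee_{i=1}^n B_i\} \ge 0$.
This implies $c \notin L(X|B) = \left]-\infty, \underline{E}(X|B)\right[$. □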

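Theorem 3. Let $\mathcal{D}^*$ be defined as in Theorem 2, $\mathcal{D} \subset \mathcal{D}^*$ and $\underline{P} : \mathcal{D} \to \mathbb{R}$
W-coherent. Then $\underline{E}$ is a W-coherent conditional lower prevision on $\mathcal{D}^*$ and
$\underline{E}(X|B) = \underline{P}(X|B)$, $\forall X|B \in \mathcal{D}$.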

Proof. To prove W-coherence of $\underline{E}$, we show that it satisfies properties A1), A2),
A3), A4) in Theorem 2.

As for A1), note that $\sup\{-B(X - \alpha) \mid B\} < \sup\{-B(X - \inf\{X|B\}) \mid B\} \le 0$,
$\forall X|B \in \mathcal{D}^*$, $\forall \alpha < \inf\{X|B\}$. This implies $\underline{E}(X|B) \ge \inf\{X|B\}$.

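As for A2), let $k > 0$ (the case $k = 0$ is trivial), $\alpha \in L(X|B)$, $n \ge 0$, $X_i|B_i \in \mathcal{D}$,
$s_i \ge 0$ ($i = 1, \ldots, n$), $W_1 = \sum_{i=1}^n g_i - B(X - \alpha)$ as in Definition 11. Then,
$\sup\{\sum_{i=1}^n k s_i B_i (X_i - \underline{P}(X_i|B_i)) - B(kX - k\alpha) \mid \bigvee_{i=1}^n B_i \vee B\} = k \sup\{W_1 \mid \bigvee_{i=1}^n B_i \vee B\} < 0$.
This implies $k\alpha \in L(kX|B)$, $\forall k > 0$, $\forall \alpha \in L(X|B)$. Hence $\underline{E}(kX|B) \ge k\underline{E}(X|B)$.
The proof of the reverse inequality is similar.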

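To prove A3), let $Y|B \in \mathcal{D}^*$, $\beta \in L(Y|B)$, $m \ge 0$, $Y_j|C_j \in \mathcal{D}$, $t_j \ge 0$ ($j = 1, \ldots, m$),
$h_j = t_j C_j (Y_j - \underline{P}(Y_j|C_j))$ such that, defining $W_2 = \sum_{j=1}^m h_j - B(Y - \beta)$,
$\sup\{W_2 \mid \bigvee_{j=1}^m C_j \vee B\} < 0$. Preliminarily, write $H = \bigvee_{i=1}^n B_i \vee \bigvee_{j=1}^m C_j \vee B$ as
the sum of four disjoint events as follows:
\[
H = B \vee \Bigl[\bigvee_{i=1}^n B_i \wedge \bigl(\bigvee_{j=1}^m C_j\bigr)^c \wedge B^c\Bigr]
\vee \Bigl[\bigl(\bigvee_{i=1}^n B_i\bigr)^c \wedge \bigvee_{j=1}^m C_j \wedge B^c\Bigr]
\vee \Bigl[\bigvee_{i=1}^n B_i \wedge \bigvee_{j=1}^m C_j \wedge B^c\Bigr].
\]
Observe also that $\sup W_1$, $\sup W_2$ are both non-positive, but never simultaneously null,
conditional on each of the four events. This implies
$\sup\{W_1 + W_2 \mid H\} = \sup\{\sum_{i=1}^n g_i + \sum_{j=1}^m h_j - B(X + Y - (\alpha + \beta)) \mid H\} < 0$.
Hence $\alpha + \beta \in L(X + Y|B)$, $\forall \alpha \in L(X|B)$, $\forall \beta \in L(Y|B)$, and
$\underline{E}(X + Y|B) \ge \underline{E}(X|B) + \underline{E}(Y|B)$ follows.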

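As for A4), let $X|A \wedge B \in \mathcal{D}^*$, $W = A(X - \underline{E}(X|A \wedge B))$. To prove that
$\underline{E}(W|B) = \sup L(W|B) = 0$, we show that $L(W|B) = \left]-\infty, 0\right[$. Given $\delta > 0$, it
ensues from the definition of $\underline{E}(X|A \wedge B)$ that $\exists\, n \ge 0$, $X_i|B_i \in \mathcal{D}$, $s_i \ge 0$ ($i = 1, \ldots, n$)
such that, defining $G = \sum_{i=1}^n s_i B_i (X_i - \underline{P}(X_i|B_i))$ and
$Z_1 = G - AB(X - \underline{E}(X|A \wedge B) + \delta)$, $\sup\{Z_1 \mid \bigvee_{i=1}^n B_i \vee (A \wedge B)\} < 0$. Hence
$Z_2 = G - B(W + \delta) = Z_1 - BA^c\delta$ $(\le Z_1)$ is such that
\[
\sup\{Z_2 \mid \bigvee_{i=1}^n B_i \vee B\}
= \max\Bigl\{\sup\{Z_2 \mid \bigvee_{i=1}^n B_i \vee (A \wedge B)\},\ \sup\{Z_2 \mid \bigl(\bigvee_{i=1}^n B_i\bigr)^c \wedge A^c \wedge B\}\Bigr\}
\le \max\Bigl\{\sup\{Z_1 \mid \bigvee_{i=1}^n B_i \vee (A \wedge B)\},\ -\delta\Bigr\} < 0
\]
(omit the second argument in the maxima if $(\bigvee_{i=1}^n B_i)^c \wedge A^c \wedge B = \emptyset$).
This implies $-\delta \in L(W|B)$, $\forall \delta > 0$, hence $\sup L(W|B) \ge 0$. But
$\sup L(W|B) = 0$, because $0 \notin L(W|B)$: by contradiction, assuming $0 \in L(W|B)$
would imply, as can be easily seen, $\underline{E}(X|A \wedge B) \in L(X|A \wedge B) = \left]-\infty, \underline{E}(X|A \wedge B)\right[$.

Finally, we prove that $\underline{E}(X|B) = \underline{P}(X|B)$, $\forall X|B \in \mathcal{D}$. If $X|B \in \mathcal{D}$, taking
$n = 1$, $s_1 = 1$, $X_1|B_1 = X|B$ in the definition of $\underline{E}(X|B)$,
$\sup\{B(X - \underline{P}(X|B)) - B(X - \alpha) \mid B\} = \alpha - \underline{P}(X|B) < 0$, $\forall \alpha < \underline{P}(X|B)$. Hence $\underline{E}(X|B) \ge \underline{P}(X|B)$. For
the reverse inequality, note that $\forall X|B \in \mathcal{D}$, $\forall X_i|B_i \in \mathcal{D}$, $\forall s_i \ge 0$ ($i = 1, \ldots, n$),
$\sup\{\sum_{i=1}^n g_i - B(X - \underline{P}(X|B)) \mid \bigvee_{i=1}^n B_i \vee B\} \ge 0$, by the W-coherence of $\underline{P}$ on $\mathcal{D}$. It
ensues $\underline{P}(X|B) \notin L(X|B) = \left]-\infty, \underline{E}(X|B)\right[$. □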
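
The identity relating $Z_2$ to $Z_1$ in the proof of A4) is stated without intermediate steps. A possible expansion, using only the definitions of $W$, $G$, $Z_1$ and $Z_2$ given in the proof, is sketched here:

```latex
% Since W = A(X - \underline{E}(X|A \wedge B)) and B = AB + A^{c}B:
\[
  B(W + \delta)
  = AB\bigl(X - \underline{E}(X|A \wedge B)\bigr) + AB\,\delta + A^{c}B\,\delta
  = AB\bigl(X - \underline{E}(X|A \wedge B) + \delta\bigr) + BA^{c}\delta ,
\]
% so that
\[
  Z_2 = G - B(W + \delta)
  = \underbrace{G - AB\bigl(X - \underline{E}(X|A \wedge B) + \delta\bigr)}_{Z_1} - BA^{c}\delta
  \le Z_1 .
\]
% On the event (\bigvee_{i=1}^{n} B_i)^{c} \wedge A^{c} \wedge B every B_i and A are false
% while B is true, hence G = 0 and AB = 0 there, and
\[
  Z_2 = -\delta
  \quad \text{on } \Bigl(\bigvee_{i=1}^{n} B_i\Bigr)^{c} \wedge A^{c} \wedge B .
\]
```

This is where the term $-\delta$ in the second maximum comes from.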

Theorem 3 lets us extend any W-coherent conditional lower prevision $\underline{P} : \mathcal{D} \to \mathbb{R}$
to any set $\mathcal{D}'$ $(\supset \mathcal{D})$ which meets the structure requirements of $\mathcal{D}^*$ in Theorem 2.
The set $\mathcal{D}'$ does not necessarily satisfy these requirements. When it does not,
consider a partition $\mathcal{B}$ on which the random variables in $\mathcal{D}'$ are defined and let $\mathcal{X}$
be the set of all random variables on $\mathcal{B}$, and $\mathcal{D}^*$ as in Theorem 2. By Theorem
3, $\underline{E} : \mathcal{D}^* \to \mathbb{R}$ is a W-coherent conditional lower prevision, extending $\underline{P}$ to $\mathcal{D}^*$
and therefore to $\mathcal{D}' \subset \mathcal{D}^*$ as well.